Research showing that over one-fifth of videos recommended to new YouTube users are AI-generated filler reveals how optimisation systems drift when quality is weakly defined.

A study published in late 2025 found that more than 20 percent of videos recommended to new YouTube users fall into a category researchers described as “AI slop”: low-quality, auto-generated content designed primarily to attract attention rather than convey insight, originality, or expertise.

The immediate reaction to this finding has been to blame artificial intelligence. That response is understandable, but it obscures the more important issue. What this research exposes is not a sudden failure of technology but a long-standing weakness in how large digital systems define and reward value.

This is a story about optimisation without judgement. AI has not corrupted a healthy system. It has accelerated tendencies that were already embedded in how attention platforms work.

What the Research Actually Shows

The analysis, conducted by Kapwing, focused specifically on the experience of brand-new YouTube users. Researchers created fresh accounts and observed the videos recommended before any meaningful viewing history had been established. This cold-start phase is significant because it exposes how recommendation systems behave when they lack contextual signals and must rely on broad, generalised indicators of relevance.

What emerged was not a marginal anomaly but a consistent pattern. More than one-fifth of the videos surfaced to new users were identified as low-quality, AI-generated filler: content produced primarily to capture attention rather than convey insight or originality. The scale of this behaviour is substantial.

The research reviewed thousands of high-performing channels and identified hundreds dedicated almost entirely to automated content, collectively accounting for tens of billions of views.

This is not incidental noise at the edges of the platform. It is activity concentrated at the centre of the recommendation system, where visibility is highest and incentives are strongest.

As journalist Max Read has observed, “There are these big swathes of people on Telegram, WhatsApp, Discord and message boards exchanging tips and ideas [and] selling courses about how to sort of make slop that will be engaging enough to earn money,” underscoring how organised and economically motivated this ecosystem has become.

The findings challenge the assumption that low-quality AI content is a fringe problem. Instead, they reveal how optimisation systems, when deprived of strong quality signals, can systematically elevate content that is cheap to produce, easy to replicate, and effective at triggering shallow engagement.

Why New Users See Disproportionately Poor Content

The most revealing aspect of this story is not that low-quality AI content exists, but that it is disproportionately visible to new users. This is not accidental. It is a consequence of how recommendation systems behave under uncertainty.

When a platform has no prior data about a user, it cannot infer taste, credibility thresholds, or trust calibration. In that situation, algorithms fall back on proxy metrics that are easy to observe and quick to accumulate. Clicks, watch time, repetition, and emotional intensity become stand-ins for relevance.

AI-generated content is unusually effective at exploiting these proxies. It is cheap to produce, easy to vary, and optimised for familiarity and emotional hooks. When marginal production cost collapses, volume itself becomes a strategy. In a cold-start environment, repetition can look like popularity, and popularity can be mistaken for quality.

From a systems perspective, this outcome is not surprising. When quality is difficult to measure and trust signals are absent, optimisation systems default to what they can see. Engagement becomes the currency, even when it correlates poorly with value.
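To make the mechanism concrete, consider a toy ranking function. The fields, weights, and video titles below are invented for illustration and are not a description of YouTube’s actual system; the point is only that when engagement proxies are all a scorer can see, volume and repetition can outrank substance.

```python
# Illustrative sketch only: a toy cold-start ranker that, lacking user
# history, falls back on engagement proxies. Fields, weights, and titles
# are hypothetical assumptions, not YouTube's actual logic.
from dataclasses import dataclass

@dataclass
class Video:
    title: str
    clicks: int                # proxy: attention captured
    avg_watch_seconds: float   # proxy: time spent
    near_duplicates: int       # similar uploads in circulation (volume strategy)

def cold_start_score(v: Video) -> float:
    """Score a video for a user with no history, using only observable proxies."""
    # With no taste or trust signals, engagement is the only currency.
    engagement = v.clicks * 0.6 + v.avg_watch_seconds * 0.4
    # Repetition across near-duplicate uploads inflates apparent popularity.
    volume_boost = 1 + 0.1 * v.near_duplicates
    return engagement * volume_boost

videos = [
    Video("Hand-researched explainer", clicks=900, avg_watch_seconds=420, near_duplicates=0),
    Video("Auto-generated filler #37", clicks=700, avg_watch_seconds=95, near_duplicates=40),
]

for v in sorted(videos, key=cold_start_score, reverse=True):
    print(f"{cold_start_score(v):>8.1f}  {v.title}")
```

In this sketch the filler video wins purely because it is cheap to duplicate, which is precisely the behaviour the research observed at scale.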

The Economics Behind the Flood

It is tempting to frame “AI slop” as bad behaviour by irresponsible creators. That framing misses the structural incentive at work. Generative tools have dramatically reduced the cost of producing content, while the mechanisms for evaluating quality remain slow, subjective, and expensive.

This gap creates an economic opportunity. Content that is fast to produce and good enough to trigger shallow engagement can outperform slower, more thoughtful work, at least in the short term. The fact that some of the most prolific slop channels generate millions in advertising revenue is not evidence of moral failure. It is evidence of a system rewarding what it measures.
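A rough, back-of-the-envelope sketch shows why. Every figure below is an assumption chosen only to illustrate the incentive structure, not a claim about actual platform payouts.

```python
# Illustrative arithmetic only. All figures are hypothetical assumptions
# chosen to show the structural incentive, not real platform economics.
def revenue_per_hour(videos_per_hour: float, views_per_video: float,
                     revenue_per_1k_views: float) -> float:
    return videos_per_hour * views_per_video * revenue_per_1k_views / 1000

# Assumed: a generative pipeline turns out 10 clips an hour that each scrape
# 5,000 views; a researched video takes 20 hours and earns 200,000 views.
slop = revenue_per_hour(videos_per_hour=10, views_per_video=5_000, revenue_per_1k_views=2.0)
crafted = revenue_per_hour(videos_per_hour=1 / 20, views_per_video=200_000, revenue_per_1k_views=2.0)

print(f"Automated filler : ~${slop:.2f} per creator-hour")    # ~$100.00
print(f"Researched video : ~${crafted:.2f} per creator-hour")  # ~$20.00
```

Under these assumed numbers, the automated pipeline returns roughly five times more per creator-hour, even though each individual video is worth far less.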

We see a familiar pattern. Whenever production capacity scales faster than evaluation capacity, noise increases. AI has simply widened that gap to the point where it is no longer easy to ignore.

Why the Literacy Question Matters More Than It Appears

Some commentary has focused on the idea that users either cannot tell AI-generated filler from substantive content or simply do not care. This observation is often framed as a consumer education problem.

That interpretation is incomplete. The more concerning issue is not individual discernment, but collective signal erosion. When low-quality material dominates attention channels, expectations shift. Over time, users recalibrate what they consider normal, credible, or worth attention.

This dynamic has implications far beyond entertainment platforms. In professional environments, teams increasingly rely on AI-generated summaries, automated insights, and algorithmically prioritised information. If those systems optimise for activity rather than understanding, organisations risk internalising the same pattern seen on public platforms.

High output does not guarantee high insight. Without strong quality signals, it often masks the opposite.

The Delivery Parallel Most Organisations Miss

For project delivery professionals, this story should feel uncomfortably familiar. Many organisations are adopting AI tools with vague success criteria. Speed, coverage, and apparent productivity are celebrated, while judgement, relevance, and decision quality are harder to articulate and therefore easier to ignore.

The YouTube example is not an outlier. It is a mirror. When teams commission AI systems without clearly defining what good looks like, optimisation fills the gap. The result may look efficient while quietly degrading trust and usefulness.

This is not a tooling problem. It is a commissioning problem.

Why Governance Has to Start Earlier

Calls for stricter moderation or content controls focus on the visible symptoms, not the cause. By the time moderation becomes necessary, incentives are already entrenched.

Quality is shaped much earlier, when objectives are set, metrics are chosen, and trade-offs are accepted. Governance that operates only at the output stage is reactive by design.

From a delivery standpoint, governance must move upstream. It must inform how success is defined, how performance is measured, and what compromises are explicitly ruled out. Without that discipline, AI systems will optimise faithfully but shallowly.
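What moving upstream can look like in practice is simply defining the value function, and the compromises ruled out, before anything is optimised. A minimal sketch, assuming a hypothetical quality_score measure and threshold:

```python
# A minimal sketch of "governance moving upstream": success is defined as a
# metric before any optimisation runs, with one compromise explicitly ruled
# out. The quality_score field and threshold are assumptions for
# illustration, not a prescribed standard.
from dataclasses import dataclass

@dataclass
class Output:
    engagement: float      # what is easy to measure
    quality_score: float   # what must be measured deliberately (0.0 to 1.0)

MIN_QUALITY = 0.7  # trade-off ruled out at commissioning time

def success_value(o: Output) -> float:
    """Value an output only if it clears the quality floor; otherwise zero."""
    if o.quality_score < MIN_QUALITY:
        return 0.0  # high engagement cannot buy back missing quality
    return o.engagement * o.quality_score

outputs = [Output(engagement=5000, quality_score=0.4),
           Output(engagement=1200, quality_score=0.9)]
print([success_value(o) for o in outputs])  # [0.0, 1080.0]
```

The specific numbers matter less than the discipline: the quality floor is a commissioning decision made before the system starts optimising, not a moderation decision made after.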

Human Judgement as a System Requirement

A recurring misunderstanding in discussions about AI failure is the assumption that better models will solve the problem. In many cases, the issue is not insufficient intelligence but insufficient judgement embedded in system design.

AI does not question objectives. It does not recognise when metrics stop correlating with value. It does not know when optimisation has crossed into distortion. Those responsibilities remain human.

From our perspective, the role of human judgement is shifting upstream, away from individual outputs and toward system architecture. Deciding what to optimise is becoming more important than optimising well.

What This Story Is Really Signalling

The rise of AI-generated filler in recommendation feeds is not the end state. It is an early signal of what happens when powerful generation meets weak evaluation. It shows how quickly systems drift when incentives are misaligned and literacy lags capability.

The lesson is not to fear AI adoption. It is to take system design seriously. Outcomes will reflect the clarity of objectives, the strength of quality signals, and the willingness to intervene early.

Closing Reflection

From our perspective, the YouTube findings are not a cautionary tale about bad content. They are a warning about weak systems.

AI produced exactly what it was rewarded for producing. If quality disappeared, it is because quality was never clearly encoded into the system in the first place.

As AI becomes embedded across delivery environments, this distinction will matter more, not less. Tools will continue to improve. The differentiator will be whether organisations have the discipline to define value precisely and govern for it deliberately.

Quality does not disappear by accident. It erodes when systems reward volume, speed, and shallow engagement more reliably than judgement.

For delivery leaders, the task is not to criticise algorithms, but to recognise the same optimisation patterns inside their own organisations. Examine where output is being mistaken for value and where AI is amplifying noise rather than insight.

Engage with Project Flux to stay ahead of how literacy, governance, and delivery practice must evolve together as AI becomes normal rather than novel.
