
A new international study has found that AI models such as ChatGPT, Google Gemini, Microsoft Copilot, and Perplexity AI frequently misrepresent or distort news events, with nearly half of their responses to news-related questions containing significant factual or sourcing issues.
The report, published on Wednesday by the European Broadcasting Union (EBU) and the BBC, adds to growing concerns about the reliability of artificial intelligence tools as news sources in an era of rapidly expanding AI-generated content.
AI Models Fail Accuracy Test in Global Media Study
Researchers assessed more than 2,700 AI-generated answers to news-based questions submitted by 22 public media outlets representing 18 countries and 14 languages. The tests were conducted between late May and early June 2025, with journalists and fact-checkers comparing each response against verified public information and cited sources.
The results were stark: 45% of all AI responses contained at least one “significant” issue, ranging from factual inaccuracies and incorrect sourcing to missing context.
Referring to the companies behind the assistants, Jean Philip De Tender, EBU Deputy Director General, and Pete Archer, the BBC’s Head of AI, wrote in the report’s foreword: “They have not prioritised this issue and must do so now. They also need to be transparent by regularly publishing their results by language and market.”
Sourcing Problems Top the List of AI Errors
The study found that the most frequent problem involved sourcing errors, with 31% of responses containing:
- Unsupported or fabricated references,
- Incorrect attributions, or
- Information that could not be verified.
This suggests that AI models often “hallucinate” sources, inventing citations or misattributing the origin of information and presenting it as if it were verified.
The second most common problem involved accuracy, affecting 20% of all responses, while 14% lacked essential context, leading to misleading or incomplete explanations of real events.
Google’s Gemini Ranked the Least Reliable
Among the four models tested, Google’s Gemini performed the worst, with significant issues in 76% of its responses, driven largely by poor sourcing, according to the report.
All models tested — including ChatGPT (OpenAI), Copilot (Microsoft), and Perplexity AI — were found to make basic factual errors, reinforcing the widespread criticism that AI systems remain unreliable for delivering verified news.
Some of the notable factual blunders cited include:
- ChatGPT referring to Pope Francis as the sitting pontiff months after his death in April 2025.
- Perplexity AI incorrectly claiming that surrogacy is illegal in the Czech Republic.
These errors highlight how AI models may confidently present false or outdated information without clear disclaimers, potentially misleading users who rely on them for timely updates.
AI Companies Under Pressure to Prioritize Accuracy
The findings have intensified calls for greater transparency, accountability, and safety testing in AI model development. Despite their growing popularity, OpenAI, Google, Microsoft, and Perplexity have been criticized for prioritizing innovation and speed over factual reliability.
The EBU and BBC urged tech firms to publish regular accuracy audits, particularly broken down by language and regional market, to help the public understand where and how these models perform best — or fail most.
“They need to make accuracy a priority, not an afterthought,” De Tender and Archer emphasized.
Experts Warn of Rising Misinformation Risks
Jonathan Hendrickx, Assistant Professor of Media Studies at the University of Copenhagen, said the study underscores the urgent need for media literacy education in the age of AI.
“The rise of disinformation and AI-generated content more than ever blurs the boundaries of what is real or not,” Hendrickx told Al Jazeera. “This poses a major issue for media practitioners, regulators, and educators that needs urgent care and attention.”
Hendrickx and other academics warn that as generative AI tools become mainstream, they risk spreading inaccuracies faster and wider than traditional misinformation sources, especially when used to summarize breaking news or social media trends.
AI, Journalism, and the Struggle for Trust
The study’s findings come as many news organizations experiment with AI-assisted content generation, from automated summaries to translation and headline creation. However, the results suggest that AI still lacks the contextual understanding, verification discipline, and ethical judgment that define professional journalism.
Experts argue that AI should supplement, not replace, human editors and reporters, especially when covering sensitive or evolving stories such as wars, political crises, and scientific developments.
The EBU-BBC report calls for collaboration between tech firms and public media organizations to improve the accuracy and transparency of AI outputs, especially in multilingual contexts where errors can multiply.
Public Awareness Key to Responsible AI Use
While AI models like ChatGPT and Gemini are increasingly used by millions of people for everyday queries, the study serves as a reminder that their answers should be treated as starting points — not verified facts.
Media experts recommend that users:
- Cross-check AI information with credible sources.
- Avoid relying on AI summaries for breaking news.
- Stay informed about AI transparency reports when available.
As Hendrickx noted, fostering critical thinking and media literacy from a young age is essential to navigating a future where AI-generated misinformation could shape public understanding of global events.

