How Microsoft’s artificial intelligence chatbot ‘hallucinates’ election data


The coming year will see a flurry of democratic events, with major elections scheduled in the US, the EU, Taiwan, and elsewhere. At the same time, the rapid rise of generative AI has raised concerns about its potential to harm the democratic process.

A primary apprehension regarding generative AI revolves around its potential misuse for spreading malicious disinformation. There is a fear that these models might fabricate false statements, referred to as hallucinations, and contribute to the distortion of facts. A joint study by Algorithm Watch and AI Forensics revealed that Microsoft’s chatbot, Bing AI (powered by OpenAI’s GPT-4), provided inaccurate answers to one-third of questions related to elections in Germany and Switzerland.

The research, which aimed to assess the chatbot’s responses to candidates, polling, voting information, and general recommendations, highlighted a significant issue. Salvatore Romano, Senior Researcher at AI Forensics, emphasized that the study demonstrated how general-purpose chatbots, not just malicious actors, pose a threat to the information ecosystem. Romano urged Microsoft to acknowledge the problem, stating that merely flagging generative AI content created by others is insufficient, as even content from reputable sources produced incorrect information on a large scale.

The study identified errors in the chatbot’s responses, ranging from inaccurate election dates and outdated candidates to fabricated controversies about candidates. Notably, the misinformation often bore the name of a trustworthy source that had accurate information on the given topic. Furthermore, the chatbot generated fictional stories about candidates engaging in scandalous behavior, falsely attributing the information to reputable sources.

On occasion, Microsoft’s AI chatbot declined to answer questions it lacked information on. More often, however, rather than remaining silent, it invented responses, including fabricated allegations of corruption.

The data for the study was gathered between August 21, 2023, and October 2, 2023. Upon sharing the findings with Microsoft, the technology giant expressed a commitment to addressing the issue. Nevertheless, a month later, subsequent samples revealed comparable outcomes.

Microsoft’s press office was unavailable for comment before the publication of this article. Nonetheless, a company spokesperson informed the Wall Street Journal that Microsoft was actively working to resolve issues and enhance its tools for the upcoming 2024 elections. Simultaneously, users were encouraged to exercise their “best judgment” when reviewing results from Microsoft’s AI chatbot.

Riccardo Angius, Applied Math Lead and Researcher at AI Forensics, argued against describing these errors as mere “hallucinations.” According to Angius, their research sheds light on the more intricate and systematic occurrence of misleading factual inaccuracies in general-purpose LLMs and chatbots.
