The rise of generative artificial intelligence (AI) tools built on large language models (LLMs), such as ChatGPT, Bard, and Bing, has ignited a spirited discussion among Twitter and Reddit influencers. Dubbed the “AI hallucination debate,” it centers on concerns about the accuracy and reliability of information generated by these tools, fueling calls for ethical oversight and fact-checking, says GlobalData, a leading data and analytics company.
Smitarani Tripathy, Social Media Analyst at GlobalData, comments: “The debate among social media contributors over hallucination in generative pretrained transformer (GPT) AI models is linked to the transparency and reliability of the analysis, academic writing, and potentially biased information these models generate.
“Twitter influencers have opined that these LLM-based AI tools are designed to approximate human language, not truth, and often contain half-truths, misremembered details, and plagiarism, confounding users and raising questions about the accuracy and reliability of AI. Sentiment among contributors is mostly negative, as they have highlighted the need for ethical oversight, fact-checking, and input from social scientists and ethicists to align AI systems with human values.”
Meanwhile, some influencers highlight the potential implications for healthcare, privacy, and information accuracy, and argue that hallucinations in AI could spur creativity and deepen understanding of human conversation.
Tripathy concludes: “Striking the right balance will not only address the challenges but also unleash the potential for creativity and improved understanding in human-AI interactions.”
Below are a few popular influencer opinions captured by GlobalData’s Social Media Analytics Platform:
“GPT-4 as brain surgeon: GPT-4 scored 83 percent on neurosurgery board exams, GPT-3.5 got 62 percent, and Bard 44 percent. Even more intriguing, the paper measured hallucinations. Bard had a hallucination rate of 57 percent, while GPT-4’s was just 2 percent. That suggests a potential for real progress on made-up answers.”
“3/ Re hallucinations: these LLMs do not yet have the ability to accept image input. Despite not having this info, the LLMs confabulated by making up the missing imaging information. On imaging-based questions, Bard had a hallucination rate of 57 percent while GPT-4 had a rate of 2.3 percent.”
“I call BS. #GenerativeAI is designed to approximate human language but NOT to be truthful. These tools have been rushed out without vetting.”
“GPT hallucination is a feature, not a bug! People always confabulate. Half-truths and misremembered details are hallmarks of human conversation; confabulation is a signature of human memory. These models are doing something just like people. #GeoffreyHinton #AI #ChatGPT #GPT4”
“For those not familiar, AI hallucination is when the AI starts making stuff up to fill in space within a discussion, or starts to believe that falsehoods are true in discussion. All AIs I’ve used may do this to some degree, but Bard sorta be trippin’…”