How to Keep Your Chatbots Honest and Accurate – With Guardrails for AI Hallucinations

Introduction

As generative AI applications proliferate in the enterprise and the classroom, it’s crucial to understand risks like AI bias and hallucinations, and approaches to mitigate them responsibly. In this post, we take a closer look at strategies for minimizing the risk of AI hallucinations.

What are Hallucinations?

Hallucinations occur when AI systems like ChatGPT generate responses that are factually incorrect or entirely fabricated despite sounding plausible: for example, attributing fake quotes, citing made-up statistics, or describing events that never occurred.

Causes of Hallucinations

Hallucinations stem from limitations in the statistical training process. AI models find patterns in text but don’t truly comprehend meaning. They are optimized to produce seemingly reasonable responses, which sometimes results in false information.

Examples of Hallucinations

  • Stating historically inaccurate dates, figures, or events
  • Fabricating logical-sounding but fictional people, places, or concepts
  • Presenting opinions as facts without sources
  • Providing unsafe or unethical advice in domains like medicine

Hallucinations vs Lying

It’s crucial to recognize that models lack intent or agency. They do not deliberately “lie” out of self-interest. It is worth digging a bit deeper into the causes here.

At a technical level, hallucinations stem from the statistical nature of how LLMs are trained. Models like GPT-3 analyze vast datasets to detect probabilistic patterns between words and texts. This allows them to generate plausible continuations for prompts.

However, the models lack any true conceptual understanding of the words. They have no grounding in the factual accuracy of the statements they generate. Their objective is simply to produce responses that statistically appear truthful and coherent.

So, when a prompt leads the model into territory with gaps, inaccuracies, or false premises, it has no inherent mechanism to discern that and course-correct. It will seamlessly continue generating fluent text even when the content has deviated into fiction.

Unlike humans, the model has no inherent sense of “truth” vs “lies.” Any hallucinations are unintentional side effects of the statistical training paradigm rather than deliberate deceit. However, it remains crucial that we recognize this propensity exists and account for it through sound prompts and verification.
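
To make the mechanics concrete, here is a minimal sketch of that behavior, assuming the open-source Hugging Face transformers library and the small GPT-2 checkpoint (both choices are illustrative): the model simply extends the prompt with statistically likely words, and nothing in the process checks whether the premise is true.

```python
# A minimal sketch, assuming the Hugging Face transformers library and GPT-2.
# The model extends the prompt with statistically likely tokens; it has no
# mechanism for checking whether the resulting claim is factually true.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "The first person to walk on Mars was"
result = generator(prompt, max_new_tokens=20, do_sample=True, temperature=0.9)

# The continuation reads fluently even though the premise is false:
# no one has walked on Mars.
print(result[0]["generated_text"])
```

Running the sketch a few times will typically yield different, equally confident continuations, which is exactly the hallucination pattern described above.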

AI-generated hallucinations could become a growing concern for educators as chatbots enter the classroom. The sections below cover the risks, causes, examples, and proven techniques to promote truthfulness.

Risks of AI Hallucinations in Education

As AI adoption in the classroom accelerates, educators are understandably concerned that hallucinations could undermine large-scale adoption unless adequate checks and balances are in place.

Unchecked hallucinations pose multiple dangers for students including:

  • Learning and spreading false information that jeopardizes competency
  • Plagiarizing or citing incorrect data, undermining academic integrity
  • Receiving unethical or dangerous health or safety advice
  • AI reflecting and reinforcing biases via false claims

Vivid Examples of Hallucinations

  • Inventing historical events, figures or dates confidently
  • Fabricating logical-sounding but fictional scientific laws or chemical properties
  • Presenting opinions on social issues as factual without substantiation
  • Providing unsafe medical recommendations or advice

Requiring sourced claims, admitting uncertainty, and verifying outputs prevent passing fiction off as truth. These habits uphold rigor, provide teachable moments in critical analysis, and discourage blind reliance on AI. Adopting prompt engineering best practices can help mitigate the risk of AI hallucinations.

Prompt Engineering Best Practices to Minimize Hallucinations

Carefully crafted prompts can encourage more truthful, verified responses from AI systems; a consolidated code sketch follows the list below.

  1. Require citing credible sources

“Please cite your sources for the key statistics you have included.”

  2. Reward transparency about uncertainty

“I appreciate you clearly indicating which parts of your response are established facts vs. uncertain.”

  3. Avoid false premises

Recommended: “What is the evidence that Einstein’s first wife influenced his work?”

Not recommended: “How did Einstein’s first wife contribute to his theory of relativity?”

  4. Flag unverified statements

“Could you highlight any information you are not fully certain about for me to verify separately?”

  5. Require validation against external data

“Verify these historical dates against credible sources and revise any inaccurate information.”
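
Putting the first four practices together, here is a minimal sketch of baking them into a reusable system prompt. It assumes the OpenAI Python SDK, and the model name is illustrative; any chat-completion API could be used the same way.

```python
# A minimal sketch, assuming the OpenAI Python SDK; the model name is illustrative.
from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

GUARDRAIL_SYSTEM_PROMPT = (
    "You are a classroom assistant. "
    "Cite a credible source for every key statistic or factual claim. "
    "Clearly separate established facts from uncertain or disputed points. "
    "If a question contains a false premise, point that out instead of answering it. "
    "Flag any statement you are not fully certain about so it can be verified."
)

def ask_with_guardrails(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; use whichever model you have access to
        messages=[
            {"role": "system", "content": GUARDRAIL_SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
        temperature=0.2,  # lower temperature tends to reduce speculative answers
    )
    return response.choices[0].message.content

# A question with a shaky premise, echoing the example above.
print(ask_with_guardrails("How did Einstein's first wife contribute to his theory of relativity?"))
```

With the guardrail prompt in place, a well-behaved model is more likely to flag the contested premise than to invent specifics, though no prompt eliminates hallucinations entirely.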

Setting expectations through rigorous prompting is key to minimizing risks. Continual reinforcement of truthfulness norms can encourage AI to exhibit greater care and transparency. The key is recognizing gaps, implementing safeguards, emphasizing transparency, and upholding rigorous standards. With prudent oversight, AI can still provide significant educational value.
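
Prompting alone will not catch every error, so the fifth practice, validating against external data, can also be automated as an explicit second pass. Here is a minimal sketch in the same spirit, again assuming the OpenAI Python SDK, where a trusted reference text supplied by the teacher is used to check and revise a draft answer.

```python
# A minimal sketch of a verification pass, assuming the OpenAI Python SDK.
from openai import OpenAI

client = OpenAI()

def verify_against_source(draft_answer: str, reference_text: str) -> str:
    """Ask the model to check a draft answer against a trusted reference text."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative
        messages=[
            {
                "role": "system",
                "content": (
                    "Check the draft answer against the reference text. "
                    "Revise any dates, figures, or claims the reference contradicts, "
                    "and mark anything the reference does not cover as 'unverified'."
                ),
            },
            {
                "role": "user",
                "content": f"Reference:\n{reference_text}\n\nDraft answer:\n{draft_answer}",
            },
        ],
        temperature=0,
    )
    return response.choices[0].message.content
```

Because the comparison is still performed by the model, this pass complements rather than replaces human review.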

What prompts do you use to keep your AI copilots honest? Share your hallucination busters just in time for Halloween! 👻👻