Technology Insights: Understanding and mitigating AI hallucinations with guardrails



An introduction

Generative AI (GenAI) refers to a class of artificial intelligence models designed to produce content based on statistical probabilities derived from vast amounts of data drawn from a wide variety of sources. GenAI’s output is created by interpreting user input, following detailed instructions embedded in the model, and, when available, integrating relevant data provided by the user or by various configured external sources. Most consumers are unaware of the intricate mathematics underpinning these models and instead think of GenAI as a kind of computer genius, akin to a magician.

However, despite its sophistication, GenAI lacks the true “understanding” that many members of the general public assume it has. It does not comprehend information in a human sense; rather, it operates through a probabilistic process, predicting the most plausible answer based on the user’s input and the data encoded into the model during training. This means that when users seek wisdom or knowledge from GenAI, the responses may be well formed, yet they lack deep conceptual understanding. Essentially, GenAI uses its data to provide the best possible response within its programmed capabilities. As a result, it offers a high-probability guess rather than a definitive, accurate answer.

GenAI’s current limitations

When GenAI handles customer-specific queries, it leverages provided data, whether from conversation history, given prompts, or data injected into the request. The latter process allows it to use customer datasets with company- or domain-specific knowledge to provide more accurate and relevant responses. Nevertheless, ensuring 100% adherence to the data provided is challenging. Variations in prompts, human interference, and the inherent content of the training dataset can all influence the model’s responses.

These discrepancies can lead to what is termed “AI hallucinations,” where the AI generates responses that, while plausible sounding, are factually incorrect or nonsensical. These hallucinations are responsible for some of the most critical (and often funniest) headlines about AI, which distract from what GenAI is actually capable of.


The generation process

To mitigate AI hallucinations, it’s important to understand how generative AI models produce their responses.

1 Phase one – User input

This is when a user provides an utterance or question to the GenAI model. These inputs range from simple to complex, but the model must respond regardless of complexity. For instance, a user might ask a straightforward question like, “What time does the store close?” or a more complex one like, “How does quantum computing impact encryption?” In either case, the model begins processing by interpreting the query. The user query, together with control instructions (or prompts) and optional additional facts (retrieved from enterprise data sources through a process called retrieval-augmented generation), goes through a tokenization phase, which encodes this text into a numeric format the AI model understands. If the user’s question or the prompts are vague and non-specific, the model might misinterpret the contextual meaning of certain words, leading to hallucinated output downstream.
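
As a rough illustration of this encoding step, the sketch below uses the open-source tiktoken tokenizer to turn a combined prompt into the numeric token IDs a model consumes; the instruction, fact, and query strings are purely illustrative.

import tiktoken  # pip install tiktoken

# The user query, control instructions, and retrieved facts are concatenated
# and encoded into the numeric token IDs the model actually consumes.
encoding = tiktoken.get_encoding("cl100k_base")  # tokenizer used by several OpenAI models

control_instructions = "Answer only from the facts provided. If unsure, say so."  # illustrative
retrieved_facts = "Store hours: Mon-Sat 9am-9pm, Sun 10am-6pm."                   # illustrative
user_query = "What time does the store close?"

combined = "\n".join([control_instructions, retrieved_facts, user_query])
token_ids = encoding.encode(combined)

print(len(token_ids), "tokens")   # a few dozen tokens for this short prompt
print(token_ids[:10])             # the first few numeric IDs the model sees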

2 Phase two – Token generation

The model generates multiple variations of an answer based on statistical probabilities encoded into it during training. In this phase, the model determines the core intent of the query and aligns it with stored data to produce a response consistent with the provided instructions and facts. In cases where the model lacks specific information about the subject of the query and no additional facts were provided, it might generate a plausible but inaccurate response based on the generic data encoded during training.
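
The toy sketch below illustrates the mechanics of this phase: the model assigns a score to each candidate token, converts the scores into probabilities, and samples one. The candidate tokens and scores are invented for illustration; a real model repeats this loop over a vocabulary of tens of thousands of tokens, once per generated token.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative candidate tokens and the scores (logits) a model might assign them.
candidate_tokens = ["9", "10", "11", "midnight"]
logits = np.array([2.1, 1.3, 0.4, -1.0])

# Softmax turns scores into a probability distribution, and one token is sampled.
probabilities = np.exp(logits) / np.exp(logits).sum()
next_token = rng.choice(candidate_tokens, p=probabilities)

print(dict(zip(candidate_tokens, probabilities.round(3))))
print("sampled:", next_token)   # a plausible token wins, not necessarily the correct one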

3 Phase three – Answer ranking

The model re-ranks the possible answers from the previous step based on API query parameters. One of these parameters, called temperature, controls the degree of “creativity” in the model’s responses. The higher the value, the more “creative” (or statistically less frequent) the answer. When this parameter is set to higher values, hallucinations may occur, resulting in plausible but incorrect responses. Ideally, responses to semantically equivalent prompts should be consistent. For example, no matter how a user phrases, “What time does the store close?” (e.g., “When’s closing time?” or “How long are you open?”), the answer should be the same and accurate. Adjusting the temperature parameter can help mitigate this class of AI hallucinations.
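
A small sketch of how temperature reshapes that ranking, using made-up scores for four candidate answers: the logits are divided by the temperature before the softmax, so low values concentrate probability on the most likely answer while high values flatten the distribution.

import numpy as np

def softmax_with_temperature(logits, temperature):
    scaled = np.array(logits) / temperature
    exp = np.exp(scaled - scaled.max())   # subtract the max for numerical stability
    return exp / exp.sum()

# Illustrative scores for four candidate answers.
logits = [2.1, 1.3, 0.4, -1.0]

print(softmax_with_temperature(logits, 0.2).round(3))   # near-deterministic: ~[0.98, 0.02, 0.0, 0.0]
print(softmax_with_temperature(logits, 1.5).round(3))   # much flatter: "creative" but riskier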

So, what is a hallucination?

AI hallucinations are instances where, due to a lack of reliable answer data, the model fabricates responses. These responses may sound reasonable but deviate from the actual provided data or expected answers. For instance, an AI model might invent a historical fact, create a fictional statistic, or provide an incorrect answer based on misunderstood context, all in an effort to produce any answer at all. This occurs because the AI model is trained to respond regardless of certainty, leading to potentially misleading or false information.

Problems caused by hallucinations and potential repercussions

AI hallucinations can lead to several significant issues:

1. Erosion of user trust

Users may lose confidence in the AI system if it consistently provides unreliable information. Trust is crucial for the adoption and success of AI technologies, just as it is for any new technology, and repeated inaccuracies or failures can lead to skepticism and reluctance to use AI solutions.

2. Misinformation dissemination

Incorrect information can spread quickly, leading to confusion and potentially harmful consequences. For example, an AI might generate an incorrect medical recommendation based on misunderstood input, which, if acted upon, could harm the user.

3. Compromised decision-making

Decisions based on inaccurate data can lead to suboptimal or harmful outcomes. In business, relying on faulty AI-generated data could result in poor strategic decisions, financial losses, or missed opportunities.

Repercussions can be severe:

Financial losses

Businesses may incur financial losses due to decisions made based on erroneous information. For instance, an AI’s incorrect market analysis might lead a company to invest in an unprofitable venture.

Reputational damage

Trust and credibility with customers can be severely impacted, damaging long-term relationships. Once a brand is associated with unreliable AI tools, it can be challenging to ever fully regain customer trust.

Potential legal and ethical repercussions

Financial or legal advice based on AI-generated misinformation could result in lawsuits and regulatory penalties. This is particularly relevant in highly regulated sectors like healthcare, finance, and legal services.

The solution: Implementing guardrails

To mitigate the risks associated with AI hallucinations, organizations should implement what are known as “guardrails.” Guardrails are mechanisms or strategies designed to prevent AI from producing erroneous or misleading outputs, ensuring the consistency and accuracy of AI responses.

Guardrails set boundaries for AI models, using verified data to ensure accurate responses. They can take several forms, including:

Specific large language model (LLM) API query parameters

Defining strict parameters for API queries to limit the scope of potential responses. This involves setting clear guidelines for what degree of response content variability is acceptable, thus reducing the risk of hallucinations.
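
As a sketch of what this can look like in practice, assuming the OpenAI Python client (other LLM providers expose equivalent parameters), a tightly constrained request might pin the sampling parameters down as follows; the model name and prompts are illustrative.

from openai import OpenAI

client = OpenAI()   # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",   # illustrative model name
    messages=[
        {"role": "system", "content": "Answer only from the provided facts."},
        {"role": "user", "content": "What time does the store close?"},
    ],
    temperature=0,    # prefer the most probable answer over "creative" ones
    top_p=1,          # do not widen the sampling pool
    max_tokens=100,   # cap the response length
)

print(response.choices[0].message.content)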

Prompt techniques

Utilizing prompt specificity or re-prompting methods to guide the AI toward accurate answers. By framing prompts in a way that the AI can easily interpret and respond to accurately, the likelihood of hallucinations is minimized.
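
For illustration, compare a vague prompt with a grounded, specific one that also gives the model an explicit way to decline; the wording is an example, not a prescribed template.

# A vague prompt invites the model to guess.
vague_prompt = "Tell me about the store hours."

# A specific, grounded prompt constrains the model to the supplied facts
# and gives it an explicit way to decline instead of hallucinating.
grounded_prompt = (
    "Using ONLY the facts below, answer the customer's question.\n"
    "If the facts do not contain the answer, reply exactly: 'I don't know.'\n\n"
    "Facts:\n- Store hours: Mon-Sat 9am-9pm, Sun 10am-6pm.\n\n"
    "Question: What time does the store close on Sunday?"
)

print(grounded_prompt)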

Retrieval-augmented generation

Grounding responses in customer-specific domain data through vector search, knowledge graphs, semantic re-ranking, and synthetic datasets. This approach ensures that the AI relies on a solid foundation of accurate, relevant data when generating responses.
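
A minimal retrieval-augmented generation sketch is shown below; TF-IDF similarity stands in for the embedding model and vector database a production system would use, and the knowledge-base passages are invented.

from sklearn.feature_extraction.text import TfidfVectorizer   # pip install scikit-learn
from sklearn.metrics.pairwise import cosine_similarity

# Invented knowledge-base passages; a production system would store embeddings
# of real enterprise documents in a vector database.
knowledge_base = [
    "Store hours: Monday-Saturday 9am-9pm, Sunday 10am-6pm.",
    "Returns are accepted within 30 days with a receipt.",
    "Gift cards never expire and can be used online or in store.",
]
query = "What time does the store close on Sunday?"

# Retrieve the passage most similar to the query.
vectorizer = TfidfVectorizer().fit(knowledge_base + [query])
similarities = cosine_similarity(
    vectorizer.transform([query]), vectorizer.transform(knowledge_base)
)
grounding = knowledge_base[similarities.argmax()]

# The verified passage is injected into the prompt so the model answers from it.
prompt = f"Answer using only this fact: {grounding}\nQuestion: {query}"
print(prompt)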

Multi-step querying

Breaking down complex queries into simpler steps to improve accuracy. It involves processing the query in stages, allowing for more precise and accurate responses.
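
The sketch below shows the general shape of this technique; ask_model is a placeholder for a real, grounded LLM call, and the sub-questions and canned answers are illustrative.

def ask_model(question: str) -> str:
    """Placeholder for a grounded LLM call; returns canned answers for this demo."""
    canned = {
        "Which plan is the customer currently on?": "The Pro plan, at $50 per month.",
        "What does the Enterprise plan cost?": "$90 per month.",
    }
    return canned.get(question, "I don't know.")

# The complex question is decomposed into simpler, verifiable sub-questions.
complex_query = "How much more would the Enterprise plan cost me than my current plan?"
sub_questions = [
    "Which plan is the customer currently on?",
    "What does the Enterprise plan cost?",
]
facts = {q: ask_model(q) for q in sub_questions}

# The final prompt asks the model to reason only over the facts gathered above.
final_prompt = (
    f"Using only these facts, answer: '{complex_query}'\n"
    + "\n".join(f"- {q} {a}" for q, a in facts.items())
)
print(final_prompt)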

Model architecture and mix of experts

Enhancing precision by using specialized models for different types of queries. This involves employing various models, each an expert in a specific domain, to handle different aspects of the query.
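
A toy router along these lines might look like the sketch below; the expert model names and the keyword-based classifier are illustrative stand-ins for a production routing model.

# Illustrative expert models, one per domain.
EXPERTS = {
    "billing": "billing-expert-model",
    "technical": "tech-support-expert-model",
    "general": "general-purpose-model",
}

def route(query: str) -> str:
    """Pick the expert model for a query; a keyword check stands in for a real classifier."""
    q = query.lower()
    if any(word in q for word in ("invoice", "charge", "refund")):
        return EXPERTS["billing"]
    if any(word in q for word in ("error", "install", "crash")):
        return EXPERTS["technical"]
    return EXPERTS["general"]

print(route("Why was I charged twice this month?"))   # billing-expert-model
print(route("The app crashes on startup."))           # tech-support-expert-model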

By employing a mix of these methods, organizations will have multiple layers of protection in place, significantly reducing the risk of hallucinations.

Guardrails in action

One organization partnered with IntelePeer to enhance its customer interaction system using guardrails. The process involved several steps. First, the organization provided its knowledge base and customer data. This data served as the foundation for the AI’s responses, ensuring they were grounded in accurate, relevant information. IntelePeer then identified common questions in the data, clustered them by topic, and assigned each cluster a verified answer. This unique clustering system allowed the AI to quickly and accurately respond to a wide range of queries using predefined, verified answers. The language model then generated multiple permutations of the questions within each cluster, ensuring that differently worded requests yielded the same answer. This step addressed the issue of varied phrasing, ensuring consistency in responses.

The clusters were designed to grow and evolve based on real-time feedback and new queries, covering 75-80% of customer interactions. This dynamic nature ensured that the system remained accurate and relevant over time. When a user prompt didn’t match any cluster permutation, it was either identifiable enough to be appropriately filed or sent to an agent for resolution. This system significantly reduced the occurrence of AI hallucinations and improved response accuracy.
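
A generic sketch of the match-or-escalate idea (not IntelePeer’s actual implementation) is shown below: a new utterance is compared against a cluster’s stored permutations, answered with the cluster’s verified answer when the similarity is high enough, and escalated to an agent otherwise. TF-IDF similarity, the example cluster, and the threshold value are illustrative stand-ins.

from sklearn.feature_extraction.text import TfidfVectorizer   # pip install scikit-learn
from sklearn.metrics.pairwise import cosine_similarity

# One illustrative cluster: several question permutations mapped to one verified answer.
clusters = {
    "store_hours": {
        "permutations": [
            "What time does the store close?",
            "When's closing time?",
            "How long are you open?",
        ],
        "answer": "We close at 9pm Monday-Saturday and 6pm on Sunday.",
    },
}

def respond(utterance: str, threshold: float = 0.35) -> str:
    """Return a cluster's verified answer if the utterance matches it, else escalate."""
    for cluster in clusters.values():
        vectorizer = TfidfVectorizer().fit(cluster["permutations"] + [utterance])
        similarity = cosine_similarity(
            vectorizer.transform([utterance]),
            vectorizer.transform(cluster["permutations"]),
        )
        if similarity.max() >= threshold:
            return cluster["answer"]      # pre-approved answer, no free-form generation
    return "ESCALATE_TO_AGENT"            # unmatched prompt goes to a human

print(respond("When's closing time?"))    # verified store-hours answer
print(respond("Do you sell gift wrap?"))  # ESCALATE_TO_AGENT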


Implementing guardrails in organizations

Organizations can implement similar guardrails of their own, and they need to do so sooner rather than later. First, they should develop comprehensive, dynamic knowledge bases from customer interaction data. This involves continuously monitoring and analyzing how customers interact with the AI to identify common queries and responses, which, while labor intensive, is worth the effort. This isn’t a “one and done” process: organizations must constantly update the system to adapt to new queries and maintain accuracy. As customer needs and questions evolve, so must the AI’s knowledge base, so that it remains relevant and accurate.

To this end, staff overseeing AI should also be trained to recognize and manage unidentifiable permutations, facilitating seamless escalation and resolution processes. Staff should be equipped to handle exceptions and unusual queries that the AI cannot accurately address, and should log those queries so that they can be identified and addressed in the future.

By understanding GenAI’s limits, anticipating inevitable issues, and circumventing them ahead of time, organizations can realize the full potential of these advanced technologies. Implementing guardrails greatly enhances the reliability and accuracy of AI-generated responses. Continuous effort is required to maintain and update these systems so that they evolve with changing customer interactions and data. This is not just a worthwhile endeavor but a critical initiative: by doing so, organizations can mitigate risks, build user trust, and leverage GenAI effectively.

See how IntelePeer can specifically help your organization improve communications and CX.

Harness the power of generative AI with IntelePeer

IntelePeer simplifies communications automation for businesses and contact centers by providing solutions that are rapidly customizable with advanced voice, messaging, and self-service technologies that create tailored customer engagements supported by AI and in-depth analytics. We provide simple, easy-to-use tools that can be utilized by anyone — enabling us to deliver industry-leading time to value and seamless integration with your infrastructure. For more information, visit intelepeer.ai
