
A quick guide to Retrieval Augmented Generation (RAG)

February 28, 2025
Scott Hutchins

In Five key factors blocking widespread AI implementation in organizations, we shared the highlights of our Service Management Unlocked (SMU) webinar. This article is part of a series outlining terms and concepts associated with artificial intelligence (AI) that were referenced during the webinar and in the summary blog post.

Read other blog posts in this series:

  • A quick guide to Supervised Machine Learning

———

Everyone is talking about AI.

As I type this sentence, I have three “to-be-read-soon” AI-related articles open in another tab, all published within the hour.

“AI cracked a superbug problem in two days that took scientists years” (BBC).
“A.I. Is Changing How Silicon Valley Builds Start-Ups” (NY Times).
“Spotify Looks to Expand AI-Narrated Audiobooks On Its Platform” (The Hollywood Reporter).

Everyone is talking about AI.

So is Xurrent. AI is an essential part of what we do, showing up in the features and functionality of our ITSM platform.

In fact, due to Xurrent’s investment in secure AI, over 90% of Xurrent’s customers have already enabled AI and are realizing incredible productivity gains.

But as we shared in A quick guide to Supervised Machine Learning, we don’t want to add more noise: our AI products are authentic, creating real impact. They are sophisticated yet easy to understand.

But we also want to be sure we explain some of the key AI-related terminology and how we use it at Xurrent. 

This article will do just that, as it relates to Retrieval Augmented Generation (RAG).

What is Retrieval Augmented Generation (RAG), and what are its main benefits?

Retrieval Augmented Generation (RAG) is a hybrid approach to machine learning that doesn’t fit neatly into either supervised or unsupervised learning.

RAG is an AI architecture that gives large language models (LLMs) the ability to retrieve and reference external information when generating responses.

The “retriever” component searches a knowledge base of indexed documents, facts, or other information and fetches the pieces most relevant to the user’s query.

The “generator” (often an LLM) uses the retrieved information and the user’s query to produce responses.
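The retriever-then-generator flow can be sketched in a few lines of Python. Everything below is an illustrative assumption: the toy knowledge base, the keyword-overlap scoring, and the stubbed generator stand in for a real vector index and LLM call, and are not Xurrent’s implementation.

```python
import re

# Toy knowledge base of indexed facts (illustrative only).
knowledge_base = [
    "Password resets are handled via the self-service portal.",
    "New laptops are provisioned within two business days.",
    "VPN access requires manager approval.",
]

def tokens(text: str) -> set[str]:
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"[a-z']+", text.lower()))

def retrieve(query: str, docs: list[str]) -> str:
    """Retriever: return the document sharing the most words with the query.

    Real systems typically use embedding similarity instead of keyword overlap.
    """
    query_words = tokens(query)
    return max(docs, key=lambda d: len(query_words & tokens(d)))

def generate(query: str, context: str) -> str:
    """Generator stub: a real RAG system would call an LLM here,
    passing the retrieved context alongside the user's query."""
    return f"Based on: '{context}' -- answering: {query}"

question = "How do I reset my password?"
answer = generate(question, retrieve(question, knowledge_base))
```

Swapping the keyword-overlap scoring for embedding similarity, and the stub for an actual LLM call, turns this sketch into the architecture described above.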

The key benefits of leveraging a RAG framework include:

  • Enhanced accuracy: Responses are grounded in specific retrieved information, not just the model’s internal knowledge
  • Up-to-date information: Not limited to the model’s training cutoff
  • Less chance of hallucination: The model has concrete information to reference
  • Greater transparency: Sources are cited
  • Dynamic knowledge incorporation: No need to retrain the entire model

Xurrent uses RAG to simplify our customers’ workload. 

How Xurrent uses RAG (+ a real-world example)

As discussed here, RAG leverages a large language model (LLM) trained, at great expense, by Anthropic — a company that trains LLMs on massive public data sets but not on Xurrent customer data. The pre-trained Anthropic models have no knowledge of customer intricacies, so when asking the LLM questions (aka prompt engineering), we share relevant details in the prompt.
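That prompt-augmentation step can be illustrated with a short sketch. The template, field names, and example text below are hypothetical and do not reflect Xurrent’s actual prompt format:

```python
# Hypothetical sketch: prepend retrieved, customer-specific context to a
# prompt so the model answers from concrete facts rather than its training
# data alone. The template wording is an assumption for illustration.

def build_prompt(question: str, retrieved_context: list[str]) -> str:
    """Combine retrieved snippets and the user's question into one prompt."""
    context_block = "\n".join(f"- {snippet}" for snippet in retrieved_context)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}"
    )

prompt = build_prompt(
    "What is the SLA for priority-1 incidents?",
    ["Priority-1 incidents have a 15-minute response SLA."],  # hypothetical fact
)
```

The resulting string would then be sent to the pre-trained model, which sees the customer-specific details only for the duration of that request.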

Why do we use this approach? Simply put, a custom-trained supervised machine learning model is lightning fast and inexpensive to run. We pass those savings on to Xurrent customers: low costs and high performance.

The alternative — a generic LLM like ChatGPT — means “paying” for a lot of irrelevant knowledge and unnecessary processing when applying it to ITSM use cases.

Let’s look at a real-world example (dating) to help us better understand RAG: 

Your best friend has just started dating someone and is asking for your opinion about them. How do you go about formulating your opinion?

You look them up for any publicly available information. You find them on LinkedIn and Facebook, but (a) that’s a lot of work, and (b) you are likely missing some very critical information.

How do you get that “custom data”? You’d have to check with your best friend … in a “private” conversation. The information you learn during that chat is private, but the pre-trained Anthropic model’s knowledge and the private data you provide within the prompt can both be leveraged.

Congratulations. You’ve now used RAG to provide solid feedback to your best friend about their new partner (and possibly disrupted the dating app industry). 

Stay tuned for more on the when, where, and how of Retrieval Augmented Generation (RAG) at Xurrent.