Newsletter

August 2024

Tuning AI with Retrieval-Augmented Generation (RAG)

Better and More Relevant Results With Fewer Hallucinations?

As artificial intelligence continues to evolve, the combination of information retrieval with natural language generation—known as Retrieval-Augmented Generation (RAG)—is emerging as a game-changer.

This innovative approach allows AI systems to generate more accurate, contextually relevant responses by accessing vast amounts of external information in real time.

In this edition of the MangoChango Newsletter, we explore how RAG works, its potential applications, and the benefits it offers for organizations seeking to harness AI for more sophisticated tasks, from customer support to content creation.



What Is Retrieval-Augmented Generation, aka RAG?

By Rick Merritt

Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources. It fills a gap in how LLMs work: they respond to general prompts at light speed, but they serve users poorly when the task calls for a deeper dive into a current or more specific topic.

RAG links generative AI services to external resources, especially ones rich in the latest technical details. RAG is “a general-purpose fine-tuning recipe” that nearly any LLM can use to connect with practically any external resource.
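To make the mechanics concrete, here is a minimal sketch of the RAG loop in Python: retrieve the documents most relevant to a query, then fold them into the prompt so the model answers from fetched facts. The word-overlap scoring, the toy corpus, and the prompt wording below are illustrative stand-ins, not any vendor's API; a production system would use embeddings, a vector store, and a real LLM call.

```python
# Illustrative RAG loop: retrieve relevant documents, then augment
# the prompt with them before calling a generative model.

def score(query: str, doc: str) -> int:
    """Toy relevance score: count of shared words (a stand-in
    for embedding similarity in a real system)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k documents most relevant to the query."""
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Augment the user's question with retrieved context."""
    context = "\n".join(retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "RAG grounds model output in facts fetched from external sources.",
    "Fine-tuning adjusts a model's weights on domain data.",
    "Chunking splits documents into retrievable pieces.",
]
print(build_prompt("How does RAG ground model output?", corpus))
# The resulting prompt would then be sent to whichever LLM you use.
```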



Breaking Up Is Hard to Do: Chunking In RAG Applications

By Ryan Donovan

Building LLM-based applications may require you to root your LLM responses in your source data. Fine-tuning an LLM on your custom data may yield a generative AI model that understands your particular domain, but it can still be subject to inaccuracies and hallucinations.

This has led many organizations to look into retrieval-augmented generation (RAG) to ground LLM responses in specific data and back them up with sources. This blog post is an interesting look at these topics, with a focus on how to split source documents into chunks that retrieve well.
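As a hedged illustration of the kind of strategy the post discusses, the sketch below splits a document into fixed-size chunks with overlap, so context is not lost at chunk boundaries. The sizes are arbitrary placeholders for illustration, not recommendations from the post.

```python
# One common chunking strategy: fixed-size chunks with overlap,
# so each chunk shares some context with its neighbor.

def chunk(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of about `size` characters, each
    sharing `overlap` characters with its predecessor."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
    return chunks

doc = "Retrieval-augmented generation grounds LLM answers in source data. " * 10
for i, piece in enumerate(chunk(doc)):
    print(i, repr(piece[:40]), "...")
```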



Building LLMs For Production: Enhancing LLM Abilities and Reliability with Prompting, Fine-Tuning, and RAG

By Louis-Francois Bouchard

This book explores various methods to adapt “foundational” LLMs to specific use cases with enhanced accuracy, reliability, and scalability. Written by more than 10 LLM experts and curated by experts from Activeloop, LlamaIndex, Mila, and more, it is a roadmap to the tech stack of the future.

The book covers the fundamentals of LLM theory, LLM techniques and frameworks, and code projects with real-world applications.



Use RAG to Improve Responses in Generative AI Applications

By Mani Khanuja

This is a one-hour, in-depth video on how generative AI applications can deliver better responses by incorporating organization-specific data through Retrieval-Augmented Generation (RAG).

Implementing RAG requires connecting to data sources, managing data ingestion workflows, and writing custom code to handle the interactions between the foundation model and those sources. This video covers how Amazon Bedrock makes that process easier.
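For a flavor of what that managed path looks like, here is a hedged sketch using the Bedrock Knowledge Bases runtime API via boto3. It assumes a Knowledge Base has already been created and synced with your data source; the knowledge base ID and model ARN below are placeholders, and the configuration details may vary with your setup.

```python
# Sketch: query a Bedrock Knowledge Base and let the service handle
# retrieval, prompt augmentation, and generation in one call.
import boto3

client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = client.retrieve_and_generate(
    input={"text": "What is our refund policy?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "YOUR_KB_ID",  # placeholder
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/"
                        "anthropic.claude-3-sonnet-20240229-v1:0",  # placeholder
        },
    },
)

# Bedrock returns the generated answer plus citations that point
# back to the retrieved source passages.
print(response["output"]["text"])
for citation in response.get("citations", []):
    for ref in citation.get("retrievedReferences", []):
        print("source:", ref.get("location"))
```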

MangoChango’s ability to deliver unquestionable value to its clients is highly dependent on keeping abreast of new technologies and trends. Our clients value this commitment to leading-edge thinking and expertise.

MangoChango’s engineers are experts in a wide variety of technologies, frameworks, tools, and languages, with an emphasis on continuous learning as new thinking, tools, and techniques come to market.

Check here for more information and to explore our technology assessment and maturity framework.

Meet The Team

Talent Inside MangoChango








