The HARG Truth: AI’s Need for the Human Element

Valtteri Karesto · September 9, 2023
While Retrieval Augmented Generation (RAG) uses machine algorithms to retrieve and generate information, Human-Augmented Retrieval Generation (HARG) incorporates a human review step for added accuracy and contextual relevance. This approach not only minimizes machine errors but also aligns the output more closely with human understanding.

Introduction

In the evolving landscape of AI-powered systems, combining human intuition with machine efficiency can create robust and reliable solutions. The Human-Augmented Retrieval Generation (HARG) method builds upon the Retrieval Augmented Generation (RAG) model, integrating a crucial human touch to the pipeline.

To understand HARG, it's essential to first understand how RAG operates:

  1. Query: The user asks a question.
  2. Retrieval step: The question is parsed, and documents relevant to it are retrieved.
  3. Documents and original query: The retrieved documents and the original prompt are fed to the language model.
  4. Response: An answer is generated based on the documents and the original query.

This is a distilled overview. Each step contains its own intricacies and nuances, but the basic framework is as outlined above.
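The four steps above can be sketched in a few lines of Python. This is a toy illustration, not a real implementation: `retrieve` stands in for a vector store with naive keyword overlap, and `generate` stands in for an actual language-model call by simply assembling the augmented prompt.

```python
# Toy corpus standing in for a real document store.
CORPUS = [
    "Manchester United finished third in the 2022-23 Premier League season.",
    "Manchester United won the league in the 1974-75 season.",
    "The 1987-88 season saw Manchester United finish second.",
]

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query (step 2)."""
    q_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def generate(query: str, context: list[str]) -> str:
    """Stand-in for the language model: build the augmented prompt (steps 3-4)."""
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"

query = "How did Manchester United perform this season?"   # step 1
docs = retrieve(query, CORPUS)                             # step 2
prompt = generate(query, docs)                             # steps 3-4
```

In a production system the overlap scoring would be replaced by embedding similarity, and `generate` would call a model with the assembled prompt.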

While RAG boasts numerous promising applications, it sometimes falls short. There are instances where the retrieved documents are similar to the query but not strictly relevant. For instance, if someone inquires about Manchester United's performance this season, and the system retrieves documents covering the '74-'75, '78-'79, and '87-'88 seasons, the response would be imprecise. A human reviewer would likely notice this discrepancy and either adjust the query (e.g., add the current year to it) or manually pick the correct documents as context.

Adding a human step

HARG is designed for knowledge-intensive tasks that rely not only on accurate retrieval of information but also on human judgment to select the most appropriate context. Unlike RAG, which automatically concatenates retrieved documents as context, HARG adds a step where a human reviews the suggestions made by the retrieval component. This ensures that the selected context is both relevant and appropriate, further reducing the chances of "hallucination": the generation of incorrect or irrelevant information.

Here’s how HARG operates:

  1. Query: The user asks a question.
  2. Retrieval step: Just like RAG, HARG retrieves a set of relevant/supporting documents from a source (e.g., Wikipedia) based on the input.
  3. Human selection step: Instead of automatically feeding the retrieved documents to the generator, a human expert reviews the suggestions and selects the most pertinent context.
  4. Documents and original query: The selected documents and the original prompt are fed to the language model.
  5. Response: An answer is generated based on the documents and the original query.
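The pipeline above can be sketched by threading a human-selection callback between retrieval and generation. As before, this is a toy sketch: `retrieve` fakes a vector store with keyword overlap, the final prompt assembly stands in for the model call, and the `human_select` callable represents whatever UI the reviewer uses to pick documents.

```python
from typing import Callable

def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    """Naive keyword-overlap retrieval, standing in for a vector store."""
    words = set(query.lower().split())
    return sorted(
        corpus,
        key=lambda d: len(words & set(d.lower().split())),
        reverse=True,
    )[:k]

def harg_pipeline(
    query: str,
    corpus: list[str],
    human_select: Callable[[list[str]], list[str]],
) -> str:
    candidates = retrieve(query, corpus)   # 2. retrieval step
    context = human_select(candidates)     # 3. human selection step
    # 4.-5. feed the chosen context and the original query to the model
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"

corpus = [
    "Results for the 1974-75 season ...",
    "Results for the 2022-23 season ...",
]
# Here the "human" keeps only documents about the current season,
# exactly the correction described in the Manchester United example.
prompt = harg_pipeline(
    "How did the team do this season?",
    corpus,
    human_select=lambda docs: [d for d in docs if "2022-23" in d],
)
```

The only structural difference from the RAG sketch is the `human_select` hook; everything before and after it stays the same, which is what makes HARG easy to retrofit onto an existing RAG pipeline.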

The inclusion of the human element in HARG serves a dual purpose: enhancing reliability by minimizing machine errors and ensuring the context aligns well with human intuition and understanding.

With the growing emphasis on human-in-the-loop AI systems, HARG bridges the best of both worlds, ensuring efficiency and relevance while maintaining the adaptability of retrieval-based generation models.

This HARG concept provides an additional layer of verification, ensuring more accurate and contextually appropriate responses.

Optimal use-cases for HARG

HARG might not be the optimal solution for use cases where the user is purely searching for answers to questions, since such a user may not know which documents are relevant to the query. Prominent use cases for HARG lie in co-pilot-like applications, where the user is generating something, e.g., code or parts of legal documents. In these cases, the user usually has enough domain knowledge to judge whether the retrieved documents are relevant and contain answers.

One use case would be a helper tool for a tech support operator. The operator might have a traditional chat UI where they converse with users. While a chat is ongoing, a HARG-enabled agent could analyze the conversation, fetch relevant documents based on user information, questions, and so on, and surface them in the UI for the operator. The operator can then pick the relevant documents and ask the agent to generate possible answers to the user's questions based on this human-augmented context.

While doing this, all the generated question/answer pairs can be stored and later used to improve the agent itself, for example through fine-tuning. The same logic applies to numerous co-pilot-like applications.
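One simple way to capture these human-reviewed interactions is to append each one as a JSON line, a format commonly accepted by fine-tuning pipelines. A minimal sketch, assuming a hypothetical `log_interaction` helper and file name:

```python
import json
from pathlib import Path

def log_interaction(path: Path, query: str, context: list[str], answer: str) -> None:
    """Append one human-reviewed interaction as a JSON line for later fine-tuning."""
    record = {"query": query, "context": context, "answer": answer}
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

log_path = Path("harg_interactions.jsonl")
log_interaction(
    log_path,
    query="How did the team do this season?",
    context=["Results for the 2022-23 season ..."],
    answer="The team finished third.",
)
```

Because the human already vetted the context, each record is a higher-quality training example than raw RAG outputs would be.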
