

# Building RAG systems with Amazon Nova

**Note**  
This documentation is for Amazon Nova Version 1. Amazon Nova 2 is now available with new models and enhanced capabilities. New features and documentation updates are published in the Amazon Nova 2 User Guide. For information, visit [What's new in Amazon Nova 2](https://docs.aws.amazon.com/nova/latest/nova2-userguide/whats-new.html).

Retrieval-Augmented Generation (RAG) optimizes the output of a large language model (LLM) by referencing an authoritative knowledge base outside of its training data sources before it generates a response. This approach helps give the model current information and ground it in domain-specific or proprietary data. It also provides a controllable information source, which you can use to set access controls to specific content and troubleshoot issues in the responses.

RAG works by connecting a *generator* (often an LLM) to a content database (such as a knowledge store) through a *retriever*. The retriever is responsible for finding relevant information. In most enterprise applications, the content database is a vector store, the retriever is an embedding model, and the generator is an LLM. For more information, see [Retrieval Augmented Generation](https://aws.amazon.com/what-is/retrieval-augmented-generation/) and [Bedrock Knowledge Bases](https://docs.aws.amazon.com/bedrock/latest/userguide/kb-how-it-works.html).

A RAG system has several components. This guide focuses on how to use Amazon Nova as an LLM in any RAG system.

You can use Amazon Nova models as the LLM within a Text RAG system. With Amazon Nova models, you have the flexibility to build a RAG system with Amazon Bedrock Knowledge bases or build your own RAG system. You can also associate your knowledge base with an Agent in Amazon Bedrock Agents to add RAG capabilities to the Agent. For more information, see [Automate tasks in your application using conversational agents](https://docs.aws.amazon.com/bedrock/latest/userguide/agents.html).

**Topics**
+ [Using Amazon Bedrock Knowledge Bases](rag-br-knowledge.md)
+ [Building a custom RAG system with Amazon Nova](rag-building.md)
+ [Using Amazon Nova for Multimodal RAG](rag-multimodal.md)

# Using Amazon Bedrock Knowledge Bases

# Using Amazon Bedrock Knowledge Bases


Amazon Bedrock Knowledge Bases is a fully managed capability that you can use to implement the entire RAG workflow, from ingestion to retrieval and prompt augmentation, without building custom integrations to data sources and managing data flows.

To use Amazon Nova models with Bedrock Knowledge bases, you must first [create a knowledge base](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-create.html) and then [connect to your data repository for your knowledge base](https://docs.aws.amazon.com/bedrock/latest/userguide/data-source-resource.html). Next, you can [test your knowledge base with queries and responses](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-test.html). Then you're ready to [deploy your knowledge base for your AI application](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-deploy.html).

To customize steps in the process, see [Configure and customize queries and response generation](https://docs.aws.amazon.com/bedrock/latest/userguide/kb-test-config.html#kb-test-config-prompt-template).
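After the knowledge base is deployed, the managed workflow can be queried with a single `RetrieveAndGenerate` call. The sketch below only constructs the request payload; the knowledge base ID, model ARN, and question are placeholders, and the boto3 call is shown commented out because it requires AWS credentials.

```python
def build_rag_request(question, kb_id, model_arn):
    """Construct the request payload for the RetrieveAndGenerate operation."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    }

# Sketch of the actual call (requires boto3 and AWS credentials):
# client = boto3.client("bedrock-agent-runtime")
# response = client.retrieve_and_generate(
#     **build_rag_request("example question", "YOURKBID", "your-model-arn")
# )
# print(response["output"]["text"])
```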

# Building a custom RAG system with Amazon Nova

**Note**  
Amazon Nova Premier is not yet available via the [RetrieveAndGenerate](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) API. To use this API with Amazon Nova Premier, you must provide a custom prompt when calling it, by supplying the `promptTemplate` in the `generationConfiguration` argument of the request, as shown below:  

```
'generationConfiguration': {
    'promptTemplate': {
        'textPromptTemplate': promptTemplate
    }
}
```
To build a custom prompt template, see [prompting guidance for RAG](https://docs.aws.amazon.com/nova/latest/userguide/prompting-tools-rag.html).
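As a sketch in Python, the configuration can be built as a plain dictionary. The template text below is illustrative only; see the linked prompting guidance for the placeholders (such as `$search_results$`) that Bedrock substitutes at runtime.

```python
def build_generation_configuration(prompt_template):
    """Wrap a custom prompt template in the generationConfiguration shape
    expected by the RetrieveAndGenerate API."""
    return {
        "promptTemplate": {
            "textPromptTemplate": prompt_template
        }
    }

# Hypothetical template text; $search_results$ is replaced by Bedrock at runtime.
prompt_template = (
    "Answer the user's question using only the numbered search results below.\n"
    "$search_results$"
)
generation_configuration = build_generation_configuration(prompt_template)
```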

You can use Amazon Nova models as the LLM within a custom text RAG system. To build your own RAG system with Amazon Nova, you can either configure your RAG system to query a knowledge base directly, or you can associate a knowledge base with an Agent. For more information, see [Building AI agents with Amazon Nova](agents.md).

When using Amazon Nova within any RAG system, there are two general approaches:
+ **Using a retriever as a tool** (Recommended): You can define your retriever for use as a tool in the `toolConfig` parameter of the Converse API or InvokeModel API. For example, you can define the Bedrock [Retrieve API](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) or any other retriever as a "tool".
+ **Using Custom Instructions for RAG systems:** You can define your own custom instructions in order to build a custom RAG system.

**Using a retriever as a tool**

Define a tool that allows the model to invoke a retriever. The definition of the tool is a JSON schema that you pass in the `toolConfig` ([ToolConfiguration](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ToolConfiguration.html)) request parameter to the `Converse` operation.

```
{
    "tools": [
        {
            "toolSpec": {
                "name": "Retrieve information tool",
                "description": "This tool retrieves information from a custom database",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "query": {
                                "type": "string",
                                "description": "This is the description of the query parameter"
                            }
                        },
                        "required": [
                            "query"
                        ]
                    }
                }
            }
        }
    ]
}
```

After the tool is defined, you can pass the tool configuration as a parameter in the Converse API.
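In Python, the tool definition above can be expressed as a dictionary and passed as the `toolConfig` parameter. The boto3 call is shown commented out as a sketch; the model ID and user question are placeholders.

```python
tool_config = {
    "tools": [
        {
            "toolSpec": {
                "name": "Retrieve information tool",
                "description": "This tool retrieves information from a custom database",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "query": {
                                "type": "string",
                                "description": "The search query to run against the database",
                            }
                        },
                        "required": ["query"],
                    }
                },
            }
        }
    ]
}

# Sketch of the call (requires boto3 and AWS credentials):
# client = boto3.client("bedrock-runtime")
# response = client.converse(
#     modelId="your-nova-model-id",
#     messages=[{"role": "user", "content": [{"text": "user question"}]}],
#     toolConfig=tool_config,
# )
```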

**How to interpret the response elements**

You will receive a response from the model as JSON under the assistant role, with the content type being `toolUse`, or with the content type being `text` if the model chooses not to use the retriever tool. If the model chooses to use the retriever tool, the response identifies the tool by name. Information about how the requested tool should be used is in the message that the model returns in the `output` ([ConverseOutput](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ConverseOutput.html)) field, specifically in the `toolUse` ([ToolUseBlock](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ToolUseBlock.html)) field. You use the `toolUseId` field to identify the tool request in later calls.

```
{
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {
                    "toolUse": {
                        "toolUseId": "tooluse_1234567",
                        "name": "Retrieve information tool",
                        "input": {
                            "query": "Reformatted user query" #various arguments needed by the chosen tool
                        }
                    }
                }
            ]
        }
    },
    "stopReason": "tool_use"
}
```

From the `toolUse` field in the model response, you can use the `name` field to identify the name of the tool. Then call the implementation of the tool and pass the input parameters from the `input` field.
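The steps above can be sketched as a small helper that pulls the tool request out of a Converse response (the field names follow the response shape shown above):

```python
def extract_tool_request(response):
    """Return (toolUseId, tool name, input dict) if the model requested a tool,
    or None if it answered with plain text instead."""
    if response.get("stopReason") != "tool_use":
        return None
    for block in response["output"]["message"]["content"]:
        if "toolUse" in block:
            tool_use = block["toolUse"]
            return tool_use["toolUseId"], tool_use["name"], tool_use["input"]
    return None

# Applied to the sample response above:
sample = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {
                    "toolUse": {
                        "toolUseId": "tooluse_1234567",
                        "name": "Retrieve information tool",
                        "input": {"query": "Reformatted user query"},
                    }
                }
            ],
        }
    },
    "stopReason": "tool_use",
}
tool_use_id, name, tool_input = extract_tool_request(sample)
# tool_input["query"] holds the search string to pass to your retriever
```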

**How to input the retrieved content back into the Converse API**

To return the retrieved results to Amazon Nova, you can now construct a message that includes a `toolResult` ([ToolResultBlock](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ToolResultBlock.html)) content block within the user role. In the content block, include the response from the tool and the ID of the tool request that you got in the previous step.

```
{
    "role": "user",
    "content": [
        {
            "toolResult": {
                "toolUseId": "tooluse_1234567",
                "content": [
                    {
                        "json": {
                            "Text chunk 1": "retrieved information chunk 1",
                            "Text chunk 2": "retrieved information chunk 2"
                        }
                    }
                ],
                "status": "success | error"
            }
        }
    ]
}
```

The [toolResult](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_ToolResultBlock.html) block contains `content`, which can include `text`, `json`, and `image` blocks (depending on the model used). If an error occurs in the tool, such as a request with nonexistent or invalid arguments, you can send error information to the model in the `toolResult` field. To indicate an error, specify `error` in the `status` field.
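A minimal helper for packaging retrieved chunks (or an error) into that shape might look like this; the chunk keys mirror the example above:

```python
def build_tool_result_message(tool_use_id, chunks, error=False):
    """Package retrieved chunks as a user-role toolResult message for Converse."""
    return {
        "role": "user",
        "content": [
            {
                "toolResult": {
                    "toolUseId": tool_use_id,
                    "content": [{"json": chunks}],
                    "status": "error" if error else "success",
                }
            }
        ],
    }

message = build_tool_result_message(
    "tooluse_1234567",
    {
        "Text chunk 1": "retrieved information chunk 1",
        "Text chunk 2": "retrieved information chunk 2",
    },
)
# Append `message` to the conversation and call Converse again for the final answer.
```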

# Using Amazon Nova for Multimodal RAG


You can use multimodal RAG to search documents such as PDFs, images, or videos (available for Amazon Nova Lite and Amazon Nova Pro). With Amazon Nova multimodal understanding capabilities, you can build RAG systems with mixed data that contains both text and images. You can do this either through Amazon Bedrock Knowledge Bases or by building a custom multimodal RAG system.

To create a multimodal RAG system:

1. Create a database of multimodal content.

1. Run inference in multimodal RAG systems for Amazon Nova.

   1. Enable users to query the content.

   1. Return the content back to Amazon Nova.

   1. Enable Amazon Nova to respond to the original user query.

## Creating a custom multimodal RAG system with Amazon Nova

To create a database of multimodal content with Amazon Nova, you can use one of two common approaches. The accuracy of either approach depends on your specific application.

*Creating a vector database using multimodal embeddings.*

You can create a vector database of multimodal data by using an embeddings model such as [Titan Multimodal Embeddings](https://docs.aws.amazon.com/bedrock/latest/userguide/titan-multiemb-models.html). To do this, you first need to parse documents into text, tables, and images efficiently. Then, to create your vector database, pass the parsed content to the multimodal embeddings model of your choice. We recommend connecting the embeddings to the portions of the document in their original modality so that the retriever can return the search results in the original content modality.

*Creating a vector database using text embeddings.*

To use a text embeddings model, you can use Amazon Nova to convert images into text. Then you create a vector database by using a text embeddings model such as the [Titan Text Embeddings V2 model](https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html).

For documents such as slides and infographics, you can turn each part of the document into a text description and then create a vector database with the text descriptions. To create a text description use Amazon Nova through the [Converse API](https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-call.html) with a prompt such as:

```
You are a story teller and narrator who will read an image and tell all the details of the image as a story.

Your job is to scan the entire image very carefully. Please start to scan the image from top to the bottom and retrieve all important parts of the image.  

In creating the story, you must first pay attention to all the details and extract relevant resources. Here are some important sources:
1. Please identify all the textual information within the image. Pay attention to text headers, sections/subsections, anecdotes, and paragraphs. Especially, extract pure-textual data not directly associated with graphs.
2. Please make sure to describe every single graph you find in the image.
3. Please include all the statistics in the graph and describe each chart in the image in detail.
4. Please do NOT add any content that is not shown in the image to the description. It is critical to keep the description truthful.
5. Please do NOT use your own domain knowledge to infer or conclude concepts in the image. You are only a narrator and you must present every single data point available in the image.

Please give me a detailed narrative of the image. While you pay attention to details, you MUST give the explanation in a clear English that is understandable by a general user.
```

Amazon Nova will then respond with a text description of the provided image. The text descriptions can then be sent to the text embeddings model to create the vector database.
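As a sketch, each text description can then be embedded with the Titan Text Embeddings V2 model through InvokeModel. Only the request body is built here (the `dimensions` and `normalize` fields are optional parameters of that model); the boto3 call is shown commented out because it requires AWS credentials.

```python
import json

def build_embedding_request(text_description):
    """Build the InvokeModel request body for Titan Text Embeddings V2."""
    return json.dumps({
        "inputText": text_description,
        "dimensions": 1024,   # optional: size of the output vector
        "normalize": True,    # optional: return unit-length vectors
    })

# Sketch of the call (requires boto3 and AWS credentials):
# client = boto3.client("bedrock-runtime")
# response = client.invoke_model(
#     modelId="amazon.titan-embed-text-v2:0",
#     body=build_embedding_request(image_description),
# )
# embedding = json.loads(response["body"].read())["embedding"]
```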

Alternatively, for text-intensive documents such as PDFs, it might be better to parse the images separately from the text (this depends on your specific data and application). To do this, you first need to parse documents into text, tables, and images efficiently. The resulting images can then be converted to text using a prompt like the one shown above. Then the resulting text descriptions of the images, together with any other text, can be sent to a text embeddings model to create a vector database. We recommend connecting the embeddings to the portions of the document in their original modality so that the retriever can return the search results in the original content modality.

*Running inference in RAG systems for Amazon Nova*

After you've set up your vector database, you can enable user queries to search it, send the retrieved content back to Amazon Nova, and then, using the retrieved content and the user query, have Amazon Nova models respond to the original user query.

To query the vector database with text or multimodal user queries, follow the same design choices that you would when performing RAG for text understanding and generation. You can either use [Amazon Nova with Amazon Bedrock Knowledge Bases](rag-br-knowledge.md) or build a [Custom RAG system with Amazon Nova and Converse API](rag-building.md).

When the retriever returns content back to the model, we recommend that you use the content in its original modality. For example, if the original input is an image, return the image to Amazon Nova even if you converted the images to text to create text embeddings. To return images effectively, we recommend that you use this template to configure the retrieved content for use in the Converse API:

```
doc_template = """Image {idx} : """

messages = []
for item in search_results:
    messages += [
        {
            "text": doc_template.format(idx=item.idx)
        },
        {
            "image": {
                "format": "jpeg",
                "source": {
                    "bytes": BASE64_ENCODED_IMAGE
                }
            }
        }
    ]

messages.append({"text": question})

system_prompt = """In this session, you are provided with a list of images and a user's question. Your job is to answer the user's question using only information from the images.

When giving your answer, make sure to first quote the images (by mentioning image title or image ID) from which you can identify relevant information, followed by your reasoning steps and answer.

If the images do not contain information that can answer the question, please state that you could not find an exact answer to the question.

Remember to add citations to your response using markers like %[1]%, %[2]% and %[3]% for the corresponding images."""
```

Using the retrieved content and the user query, you can invoke the Converse API, and Amazon Nova will either generate a response or request an additional search, depending on your instructions and on whether the retrieved content effectively answered the user query.