

# Fully managed Retrieval Augmented Generation options on AWS
<a name="rag-fully-managed"></a>

To implement Retrieval Augmented Generation (RAG) workflows on AWS, you can build custom RAG pipelines or use the fully managed capabilities that AWS services offer. Because they include many of the core components of a RAG-based system, fully managed services can take on much of the undifferentiated heavy lifting. However, these services provide less opportunity for customization.

The fully managed AWS services use connectors to ingest data from external data sources, such as websites, Atlassian Confluence, or Microsoft SharePoint. The supported data sources vary by AWS service.

This section explores the following fully managed options for building RAG workflows on AWS:
+ [Knowledge bases for Amazon Bedrock](rag-fully-managed-bedrock.md)
+ [Amazon Q Business](rag-fully-managed-q-business.md)
+ [Amazon SageMaker AI Canvas](rag-fully-managed-sagemaker-canvas.md)

For more information about how to choose between these options, see [Choosing a Retrieval Augmented Generation option on AWS](choosing-option.md) in this guide.

# Knowledge bases for Amazon Bedrock
<a name="rag-fully-managed-bedrock"></a>

[Amazon Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html) is a fully managed service that makes high-performing foundation models (FMs) from leading AI startups and Amazon available for your use through a unified API. [Knowledge bases](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html) is an Amazon Bedrock capability that helps you implement the entire RAG workflow, from ingestion to retrieval and prompt augmentation. There is no need to build custom integrations to data sources or to manage data flows. Session context management is built in so that your generative AI application can readily support multi-turn conversations.

After you specify the location of your data, knowledge bases for Amazon Bedrock internally fetches the documents, chunks them into blocks of text, converts the text to embeddings, and then stores the embeddings in your choice of vector database. Amazon Bedrock manages and updates the embeddings, keeping the vector database in sync with the data. For more information about how knowledge bases work, see [How Amazon Bedrock knowledge bases work](https://docs.aws.amazon.com/bedrock/latest/userguide/kb-how-it-works.html).
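The ingestion behavior described above is configurable when you create a data source. As a minimal sketch, the following builds the `vectorIngestionConfiguration` payload that controls how documents are chunked before embedding; the token values are illustrative placeholders, not recommendations.

```python
def fixed_size_chunking(max_tokens: int, overlap_percentage: int) -> dict:
    """Build the vectorIngestionConfiguration payload that tells a
    knowledge base how to split documents into chunks before it
    converts them to embeddings."""
    return {
        "chunkingConfiguration": {
            "chunkingStrategy": "FIXED_SIZE",
            "fixedSizeChunkingConfiguration": {
                "maxTokens": max_tokens,
                "overlapPercentage": overlap_percentage,
            },
        }
    }


# Illustrative values: chunks of up to 300 tokens with 20% overlap.
ingestion_config = fixed_size_chunking(max_tokens=300, overlap_percentage=20)

# With boto3 and AWS credentials configured, you would pass this payload
# to the bedrock-agent CreateDataSource API, for example:
#   import boto3
#   boto3.client("bedrock-agent").create_data_source(
#       knowledgeBaseId="YOUR_KB_ID",
#       name="my-data-source",
#       dataSourceConfiguration={...},
#       vectorIngestionConfiguration=ingestion_config,
#   )
```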

If you add knowledge bases to an Amazon Bedrock agent, the agent identifies the appropriate knowledge base based on the user input. The agent retrieves the relevant information and adds the information to the input prompt. The updated prompt provides the model with more context information to generate a response. To improve transparency and minimize hallucinations, the information retrieved from the knowledge base is traceable to its source.



![\[The Amazon Bedrock agent retrieves information from the knowledge base and passes it to the LLM.\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/retrieval-augmented-generation-options/images/architecture-knowledge-base.png)


Amazon Bedrock supports the following two APIs for RAG:
+ [RetrieveAndGenerate](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_RetrieveAndGenerate.html) – You can use this API to query your knowledge base and generate responses from the information it retrieves. Internally, Amazon Bedrock converts the queries into embeddings, queries the knowledge base, augments the prompt with the search results as context information, and returns the LLM-generated response. Amazon Bedrock also manages the short-term memory of the conversation to provide more contextual results.
+ [Retrieve](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent-runtime_Retrieve.html) – You can use this API to query your knowledge base and retrieve relevant information directly from it. You can use the information returned from this API to process the retrieved text, evaluate its relevance, or develop a separate workflow for response generation. Internally, Amazon Bedrock converts the queries into embeddings, searches the knowledge base, and returns the relevant results. You can build additional workflows on top of the search results. For example, you can use the LangChain [`AmazonKnowledgeBasesRetriever`](https://python.langchain.com/docs/integrations/retrievers/bedrock/) retriever to integrate RAG workflows into generative AI applications.
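As a minimal sketch of the two APIs above, the following builds the request payloads for `RetrieveAndGenerate` and `Retrieve`. The knowledge base ID and model ARN are placeholders that you would replace with your own values; the boto3 calls require AWS credentials, so they are shown in comments.

```python
def retrieve_and_generate_request(kb_id: str, model_arn: str, question: str) -> dict:
    """Request payload for the RetrieveAndGenerate API: Amazon Bedrock
    retrieves from the knowledge base, augments the prompt, and returns
    the generated response."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    }


def retrieve_request(kb_id: str, question: str, top_k: int = 5) -> dict:
    """Request payload for the Retrieve API: returns the raw search
    results so you can run your own generation step."""
    return {
        "knowledgeBaseId": kb_id,
        "retrievalQuery": {"text": question},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    }


# With boto3 and AWS credentials configured, you would call:
#   client = boto3.client("bedrock-agent-runtime")
#   client.retrieve_and_generate(
#       **retrieve_and_generate_request("YOUR_KB_ID", "YOUR_MODEL_ARN",
#                                       "What is our refund policy?"))
#   client.retrieve(**retrieve_request("YOUR_KB_ID", "refund policy"))
```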

For sample architectural patterns and step-by-step instructions for using the APIs, see [Knowledge Bases now delivers fully managed RAG experience in Amazon Bedrock](https://aws.amazon.com/blogs/aws/knowledge-bases-now-delivers-fully-managed-rag-experience-in-amazon-bedrock/) (AWS blog post). For more information about how to use the `RetrieveAndGenerate` API to build a RAG workflow for an intelligent chat-based application, see [Build a contextual chatbot application using Amazon Bedrock Knowledge Bases](https://aws.amazon.com/blogs/machine-learning/build-a-contextual-chatbot-application-using-knowledge-bases-for-amazon-bedrock/) (AWS blog post).

## Data sources for knowledge bases
<a name="rag-fully-managed-bedrock-data-sources"></a>

You can connect your proprietary data to a knowledge base. After you've configured a data source connector, you can sync your data to keep your knowledge base up to date and make your data available for querying. Amazon Bedrock knowledge bases support connections to the following data sources:
+ [Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/bedrock/latest/userguide/s3-data-source-connector.html) – You can connect an Amazon S3 bucket to an Amazon Bedrock knowledge base by using either the console or the API. The knowledge base ingests and indexes the files in the bucket. This type of data source supports the following features:
  + **Document metadata fields** – You can include a separate file to specify the metadata for the files in the Amazon S3 bucket. You can then use these metadata fields to filter and improve the relevancy of responses.
  + **Inclusion or exclusion filters** – You can include or exclude certain content when crawling.
  + **Incremental syncing** – The content changes are tracked, and only content that has changed since the last sync is crawled.
+ [Atlassian Confluence](https://docs.aws.amazon.com/bedrock/latest/userguide/confluence-data-source-connector.html) – You can connect an Atlassian Confluence instance to an Amazon Bedrock knowledge base by using either the console or the API. This type of data source supports the following features:
  + **Auto detection of main document fields** – The metadata fields are automatically detected and crawled. You can use these fields for filtering.
  + **Inclusion or exclusion content filters** – You can include or exclude certain content by using a prefix or a regular expression pattern on the space, page title, blog title, comment, attachment name, or extension.
  + **Incremental syncing** – The content changes are tracked, and only content that has changed since the last sync is crawled.
  + **OAuth 2.0 authentication, authentication with Confluence API token** – The authentication credentials are stored in AWS Secrets Manager.
+ [Microsoft SharePoint](https://docs.aws.amazon.com/bedrock/latest/userguide/sharepoint-data-source-connector.html) – You can connect a SharePoint instance to a knowledge base by using either the console or the API. This type of data source supports the following features:
  + **Auto detection of main document fields** – The metadata fields are automatically detected and crawled. You can use these fields for filtering.
  + **Inclusion or exclusion content filters** – You can include or exclude certain content by using a prefix or a regular expression pattern on the main page title, event name, and file name (including its extension).
  + **Incremental syncing** – The content changes are tracked, and only content that has changed since the last sync is crawled.
  + **OAuth 2.0 authentication** – The authentication credentials are stored in AWS Secrets Manager.
+ [Salesforce](https://docs.aws.amazon.com/bedrock/latest/userguide/salesforce-data-source-connector.html) – You can connect a Salesforce instance to a knowledge base by using either the console or the API. This type of data source supports the following features:
  + **Auto detection of main document fields** – The metadata fields are automatically detected and crawled. You can use these fields for filtering.
  + **Inclusion or exclusion content filters** – You can include or exclude certain content by using a prefix or a regular expression pattern. For a list of content types that you can apply filters to, see *Inclusion/exclusion filters* in the [Amazon Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/salesforce-data-source-connector.html#configuration-salesforce-connector).
  + **Incremental syncing** – The content changes are tracked, and only content that has changed since the last sync is crawled.
  + **OAuth 2.0 authentication** – The authentication credentials are stored in AWS Secrets Manager.
+ [Web Crawler](https://docs.aws.amazon.com/bedrock/latest/userguide/webcrawl-data-source-connector.html) – An Amazon Bedrock Web Crawler connects to and crawls the URLs that you provide. The following features are supported:
  + Select multiple URLs to crawl
  + Respect standard robots.txt directives, such as `Allow` and `Disallow`
  + Exclude URLs that match a pattern
  + Limit the rate of crawling
  + In Amazon CloudWatch, view the status of each URL crawled
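To illustrate the Amazon S3 connector described above, the following is a minimal sketch of the `dataSourceConfiguration` payload for the `CreateDataSource` API, including inclusion filters. The bucket ARN and prefixes are placeholders.

```python
def s3_data_source_config(bucket_arn: str, inclusion_prefixes: list[str]) -> dict:
    """Build the dataSourceConfiguration payload that connects an
    Amazon S3 bucket to a knowledge base, limiting ingestion to the
    given key prefixes."""
    return {
        "type": "S3",
        "s3Configuration": {
            "bucketArn": bucket_arn,
            "inclusionPrefixes": inclusion_prefixes,
        },
    }


# Illustrative placeholders: ingest only objects under docs/ and faqs/.
source_config = s3_data_source_config(
    bucket_arn="arn:aws:s3:::my-company-documents",
    inclusion_prefixes=["docs/", "faqs/"],
)

# With boto3 and AWS credentials configured, you would call:
#   boto3.client("bedrock-agent").create_data_source(
#       knowledgeBaseId="YOUR_KB_ID",
#       name="s3-docs",
#       dataSourceConfiguration=source_config,
#   )
```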

For more information about the data sources that you can connect to your Amazon Bedrock knowledge base, see [Create a data source connector for your knowledge base](https://docs.aws.amazon.com/bedrock/latest/userguide/data-source-connectors.html).

## Vector databases for knowledge bases
<a name="rag-fully-managed-bedrock-vector-stores"></a>

When you set up a connection between the knowledge base and the data source, you must configure a vector database, also known as a *vector store*. A vector database is where Amazon Bedrock stores, updates, and manages the embeddings that represent your data. Each data source supports different types of vector databases. To determine which vector databases are available for your data source, see the [data source types](https://docs.aws.amazon.com/bedrock/latest/userguide/data-source-connectors.html).

If you want Amazon Bedrock to automatically create a vector database in Amazon OpenSearch Serverless for you, choose this option when you create the knowledge base. However, you can also choose to set up your own vector database. If you do, see [Prerequisites for your own vector store for a knowledge base](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base-setup.html). Each type of vector database has its own prerequisites.

Depending on your data source type, Amazon Bedrock knowledge bases support the following vector databases:
+ [Amazon OpenSearch Serverless](https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless-vector-search.html)
+ [Amazon Aurora PostgreSQL-Compatible Edition](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/AuroraPostgreSQL.VectorDB.html)
+ [Pinecone](https://docs.pinecone.io/docs/amazon-bedrock) (Pinecone documentation)
+ [Redis Enterprise Cloud](https://docs.redis.com/latest/rc/cloud-integrations/aws-marketplace/aws-bedrock/) (Redis documentation)
+ [MongoDB Atlas](https://dochub.mongodb.org/core/amazon-bedrock) (MongoDB documentation)
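If you bring your own vector store, you describe it to Amazon Bedrock in the `storageConfiguration` of the `CreateKnowledgeBase` API. The following is a minimal sketch for an existing Amazon OpenSearch Serverless collection; the ARN, index name, and field names are placeholders that must match the index you created as a prerequisite.

```python
def opensearch_serverless_storage(collection_arn: str, index_name: str) -> dict:
    """Build the storageConfiguration payload that points a knowledge
    base at an existing Amazon OpenSearch Serverless vector index.
    The field names are illustrative and must match your index mapping."""
    return {
        "type": "OPENSEARCH_SERVERLESS",
        "opensearchServerlessConfiguration": {
            "collectionArn": collection_arn,
            "vectorIndexName": index_name,
            "fieldMapping": {
                "vectorField": "embedding",
                "textField": "text_chunk",
                "metadataField": "metadata",
            },
        },
    }


storage_config = opensearch_serverless_storage(
    collection_arn="arn:aws:aoss:us-east-1:111122223333:collection/example",
    index_name="kb-index",
)

# With boto3 and AWS credentials configured, you would pass this to:
#   boto3.client("bedrock-agent").create_knowledge_base(
#       name="my-kb",
#       roleArn="YOUR_SERVICE_ROLE_ARN",
#       knowledgeBaseConfiguration={...},
#       storageConfiguration=storage_config,
#   )
```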

# Amazon Q Business
<a name="rag-fully-managed-q-business"></a>

[Amazon Q Business](https://docs.aws.amazon.com/amazonq/latest/qbusiness-ug/what-is.html) is a fully managed, generative AI-powered assistant that you can configure to answer questions, provide summaries, generate content, and complete tasks based on your enterprise data. It allows end users to receive immediate, permissions-aware responses from enterprise data sources, with citations.

## Key features
<a name="rag-fully-managed-q-business-features"></a>

The following capabilities of Amazon Q Business can help you build a production-grade RAG-based generative AI application:
+ **Built-in connectors** – Amazon Q Business supports more than 40 types of connectors, such as connectors for Adobe Experience Manager (AEM), Salesforce, Jira, and Microsoft SharePoint. For a complete list, see [Supported connectors](https://docs.aws.amazon.com/amazonq/latest/qbusiness-ug/connectors-list.html). If you need a connector that is not supported, you can use [Amazon AppFlow](https://docs.aws.amazon.com/appflow/latest/userguide/what-is-appflow.html) to pull data from your data source into Amazon Simple Storage Service (Amazon S3) and then connect Amazon Q Business to the Amazon S3 bucket. For a complete list of data sources that Amazon AppFlow supports, see [Supported applications](https://docs.aws.amazon.com/appflow/latest/userguide/app-specific.html).
+ **Built-in indexing pipelines** – Amazon Q Business provides a built-in pipeline for indexing data in a vector database. You can use an AWS Lambda function to add preprocessing logic for your indexing pipeline.
+ **Index options** – You can create and provision a native index in Amazon Q Business, and you use an Amazon Q Business retriever to pull data from that index. Alternatively, you can use a preconfigured Amazon Kendra index as a retriever. For more information, see [Creating a retriever for an Amazon Q Business application](https://docs.aws.amazon.com/amazonq/latest/qbusiness-ug/select-retriever.html).
+ **Foundation models** – Amazon Q Business uses the foundation models that are supported in Amazon Bedrock. For a complete list, see [Supported foundation models in Amazon Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html).
+ **Plugins** – Amazon Q Business can use plugins to integrate with target systems, for example, to automatically summarize ticket information or create tickets in Jira. After you configure them, plugins support read and write actions that can help boost end-user productivity. Amazon Q Business supports two types of plugins: [built-in plugins](https://docs.aws.amazon.com/amazonq/latest/qbusiness-ug/built-in-plugin.html) and [custom plugins](https://docs.aws.amazon.com/amazonq/latest/qbusiness-ug/custom-plugin.html).
+ **Guardrails** – Amazon Q Business supports global controls and topic-level controls. For example, these controls can detect personally identifiable information (PII), abuse, or sensitive information in prompts. For more information, see [Admin controls and guardrails in Amazon Q Business](https://docs.aws.amazon.com/amazonq/latest/qbusiness-ug/guardrails.html).
+ **Identity management** – With Amazon Q Business, you can manage users and their access to the RAG-based generative AI application. For more information, see [Identity and access management for Amazon Q Business](https://docs.aws.amazon.com/amazonq/latest/qbusiness-ug/security-iam.html). Also, Amazon Q Business connectors index access control list (ACL) information that's attached to a document along with the document itself. Then, Amazon Q Business stores the ACL information it indexes in the Amazon Q Business User Store to create user and group mappings and filter chat responses based on the end user's access to documents. For more information, see [Data source connector concepts](https://docs.aws.amazon.com/amazonq/latest/qbusiness-ug/connector-concepts.html).
+ **Document enrichment** – The document enrichment feature helps you control both **which** documents and document attributes are ingested into your index and **how** they are ingested. You can use two approaches:
  + **Configure basic operations** – Use basic operations to add, update, or delete document attributes from your data. For example, you can scrub PII data by choosing to delete any document attributes related to PII.
  + **Configure Lambda functions** – Use a preconfigured Lambda function to apply more customized, advanced document attribute manipulation logic to your data. For example, your enterprise data might be stored as scanned images. In that case, you can use a Lambda function to run optical character recognition (OCR) on the scanned documents to extract text from them. Then, each scanned document is treated as a text document during ingestion. Finally, during chat, Amazon Q Business factors in the text extracted from the scanned documents when it generates responses.

  When you implement your solution, you can choose to combine both document enrichment approaches. You can use basic operations to do a first parse of your data and then use a Lambda function for more complex operations. For more information, see [Document enrichment in Amazon Q Business](https://docs.aws.amazon.com/amazonq/latest/qbusiness-ug/custom-document-enrichment.html).
+ **Integration** – After you create your Amazon Q Business application, you can integrate it into other applications, such as Slack or Microsoft Teams. For example, see [Deploy a Slack gateway for Amazon Q Business](https://aws.amazon.com/blogs/machine-learning/deploy-a-slack-gateway-for-amazon-q-your-business-expert/) and [Deploy a Microsoft Teams gateway for Amazon Q Business](https://aws.amazon.com/blogs/machine-learning/deploy-a-microsoft-teams-gateway-for-amazon-q-your-business-expert/) (AWS blog posts).

## End-user customization
<a name="rag-fully-managed-q-business-customization"></a>

Amazon Q Business supports uploading documents that might not be stored in your organization's data sources and index. Uploaded documents are not stored; they are available only in the conversation in which you upload them. Amazon Q Business supports specific document types for upload. For more information, see [Upload files and chat in Amazon Q Business](https://docs.aws.amazon.com/amazonq/latest/qbusiness-ug/upload-chat-files.html).

Amazon Q Business includes a [filtering by document attribute](https://docs.aws.amazon.com/amazonq/latest/qbusiness-ug/metadata-filtering.html) feature. Both administrators and end users can use this feature. Administrators can customize and control chat responses for end users by using attributes. For example, if the data source type is an attribute attached to your documents, you can specify that chat responses be generated only from a specific data source. Or, you can allow end users to restrict the scope of chat responses by using the attribute filters that you have selected.
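In the API, this filtering is expressed as an `attributeFilter` on the `ChatSync` request. As a minimal sketch, the following builds an equality filter; the attribute name `_data_source_type` and its value are hypothetical placeholders for whatever attributes your documents carry.

```python
def attribute_equals_filter(attribute_name: str, value: str) -> dict:
    """Build an attributeFilter payload that restricts Amazon Q Business
    chat responses to documents whose string attribute equals the given
    value. The attribute name here is a hypothetical example."""
    return {
        "equalsTo": {
            "name": attribute_name,
            "value": {"stringValue": value},
        }
    }


# Hypothetical example: only answer from documents tagged as Confluence
# content. You would pass this as the attributeFilter parameter of a
# ChatSync call:
#   boto3.client("qbusiness").chat_sync(
#       applicationId="YOUR_APP_ID",
#       userId="user@example.com",
#       userMessage="What is our release process?",
#       attributeFilter=attribute_equals_filter("_data_source_type",
#                                               "confluence"))
doc_filter = attribute_equals_filter("_data_source_type", "confluence")
```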

End users can create lightweight, purpose-built [Amazon Q Apps](https://docs.aws.amazon.com/amazonq/latest/qbusiness-ug/purpose-built-qapps.html) within your broader Amazon Q Business application environment. Amazon Q Apps enable task automation for a specific domain, such as a purpose-built app for a marketing team.

# Amazon SageMaker AI Canvas
<a name="rag-fully-managed-sagemaker-canvas"></a>

[Amazon SageMaker AI Canvas](https://docs.aws.amazon.com/sagemaker/latest/dg/canvas.html) helps you use machine learning to generate predictions without needing to write any code. It provides a no-code visual interface that empowers you to prepare data, build, and deploy ML models, streamlining the end-to-end ML lifecycle in a unified environment. The complexities of data preparation, model development, bias detection, explainability, and monitoring are abstracted away behind an intuitive interface. Users don't need to be SageMaker AI or machine learning operations (MLOps) experts to develop, operationalize, and monitor models with SageMaker AI Canvas.

With SageMaker AI Canvas, the RAG functionality is provided through a no-code, document querying feature. You can enrich the chat experience in SageMaker AI Canvas by using an Amazon Kendra index as the underlying enterprise search. For more information, see [Extract information from documents with document querying](https://docs.aws.amazon.com/sagemaker/latest/dg/canvas-fm-chat-query.html).

Connecting SageMaker AI Canvas to an Amazon Kendra index requires a one-time setup. As part of the domain configuration, a cloud administrator can choose one or more Amazon Kendra indexes that users can query when they interact with SageMaker AI Canvas. For instructions about how to enable the document querying feature, see [Getting started with using Amazon SageMaker AI Canvas](https://docs.aws.amazon.com/sagemaker/latest/dg/canvas-getting-started.html).

SageMaker AI Canvas manages the underlying communication between Amazon Kendra and the selected foundation model. For more information about the foundation models that SageMaker AI Canvas supports, see [Generative AI foundation models in SageMaker AI Canvas](https://docs.aws.amazon.com/sagemaker/latest/dg/canvas-fm-chat.html). The following diagram shows how the document querying feature works after the cloud administrator has connected SageMaker AI Canvas to an Amazon Kendra index.



![\[Workflow for the document querying feature in Amazon SageMaker AI Canvas.\]](http://docs.aws.amazon.com/prescriptive-guidance/latest/retrieval-augmented-generation-options/images/architecture-sagemaker-canvas-document-querying.png)


The diagram shows the following workflow:

1. The user starts a new chat in SageMaker AI Canvas, turns on **Query documents**, selects the target index, and then submits a question.

1. SageMaker AI Canvas uses the query to search the Amazon Kendra index for relevant data.

1. SageMaker AI Canvas retrieves the data and its sources from the Amazon Kendra index.

1. SageMaker AI Canvas updates the prompt to include the retrieved context from the Amazon Kendra index and submits the prompt to the foundation model.

1. The foundation model uses the original question and the retrieved context to generate an answer.

1. SageMaker AI Canvas provides the generated answer to the user. It includes references to the data sources, such as documents, that were used to generate the response.
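SageMaker AI Canvas performs steps 2 through 4 for you with no code. For intuition only, the following sketch shows what a custom application doing the equivalent retrieval and prompt augmentation might look like: a request payload for the Amazon Kendra `Retrieve` API, and a helper that folds the retrieved passages into a prompt. The index ID is a placeholder, the passage structure is simplified, and the prompt template is an assumption, not what Canvas uses internally.

```python
def kendra_retrieve_request(index_id: str, query: str, page_size: int = 5) -> dict:
    """Request payload for the Amazon Kendra Retrieve API (step 2:
    search the index for passages relevant to the user's question)."""
    return {"IndexId": index_id, "QueryText": query, "PageSize": page_size}


def build_augmented_prompt(question: str, passages: list[dict]) -> str:
    """Step 4: fold the retrieved passages into the prompt so the
    foundation model can ground its answer in them. The template here
    is an illustrative assumption."""
    context = "\n\n".join(
        f"[{p['title']}] {p['excerpt']}" for p in passages
    )
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# With boto3 and AWS credentials configured, you would call:
#   results = boto3.client("kendra").retrieve(
#       **kendra_retrieve_request("YOUR_INDEX_ID", "What is our PTO policy?"))
# and build the prompt from results["ResultItems"] before invoking a
# foundation model.
prompt = build_augmented_prompt(
    "What is our PTO policy?",
    [{"title": "HR Handbook", "excerpt": "Employees accrue PTO monthly."}],
)
```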