

# Working with datasets


Datasets are the foundation of your Quick Sight analytics, serving as the prepared and structured data sources that power your analyses and dashboards. Once you've created datasets from your data sources, you need to manage them effectively throughout their lifecycle to ensure reliable, secure, and collaborative analytics.

This section covers the complete dataset management workflow, from editing and versioning datasets to sharing them with team members and implementing security controls. You'll learn how to maintain dataset integrity while supporting collaborative analytics, track which analyses depend on your datasets, and implement both row-level and column-level security to protect sensitive information. Whether you're preparing datasets for team use, troubleshooting analysis issues, or implementing data governance policies, these topics provide the essential knowledge for effective dataset management in Quick Sight.

**Topics**
+ [Creating datasets](creating-data-sets.md)
+ [Editing datasets](edit-a-data-set.md)
+ [Reverting datasets back to previous published versions](dataset-versioning.md)
+ [Duplicating datasets](duplicate-a-data-set.md)
+ [Sharing datasets](sharing-data-sets.md)
+ [Tracking dashboards and analyses that use a dataset](track-analytics-that-use-dataset.md)
+ [Using dataset parameters in Amazon Quick](dataset-parameters.md)
+ [Using row-level security in Amazon Quick](row-level-security.md)
+ [Using column-level security to restrict access to a dataset](restrict-access-to-a-data-set-using-column-level-security.md)
+ [Running queries as an IAM role in Amazon Quick](datasource-run-as-role.md)
+ [Deleting datasets](delete-a-data-set.md)
+ [Adding a dataset to an analysis](adding-a-data-set-to-an-analysis.md)

# Creating datasets


You can create datasets from new or existing data sources in Amazon Quick. A variety of database data sources can provide data to Amazon Quick, including Amazon RDS instances and Amazon Redshift clusters. They also include MariaDB, Microsoft SQL Server, MySQL, Oracle, and PostgreSQL instances that run on-premises, in Amazon EC2, or in similar environments.

**Topics**
+ [Creating datasets using new data sources](creating-data-sets-new.md)
+ [Creating a dataset using an existing data source](create-a-data-set-existing.md)
+ [Creating a dataset using an existing dataset in Amazon Quick](create-a-dataset-existing-dataset.md)

# Creating datasets using new data sources

When you create a dataset based on an AWS service like Amazon RDS, Amazon Redshift, or Amazon EC2, data transfer charges might apply when consuming data from that source. Those charges might also vary depending on whether that AWS resource is in the home AWS Region that you chose for your Amazon Quick account. For details on pricing, see the pricing page for the service in question.

When creating a new database dataset, you can select one table, join several tables, or create a SQL query to retrieve the data that you want. You can also change whether the dataset uses a direct query or instead stores data in [SPICE](spice.md).

**To create a new dataset**

1. To create a dataset, choose **New data set** on the **Data** page. You can then create a dataset based on an existing dataset or data source, or connect to a new data source and base the dataset on that.

1. Provide connection information to the data source:
   + For local text or Microsoft Excel files, identify the file location and upload the file.
   + For Amazon S3, provide a manifest identifying the files or buckets that you want to use, and also the import settings for the target files.
   + For Amazon Athena, all Athena databases for your AWS account are returned. No additional credentials are required.
   + For Salesforce, provide credentials to connect with.
   + For Amazon Redshift, Amazon RDS, Amazon EC2, or other database data sources, provide information about the server and database that host the data. Also provide valid credentials for that database instance.
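For the Amazon S3 case, the manifest is a small JSON file that lists the target files (or key prefixes) and the import settings. The following is a minimal sketch that generates one; the bucket name, object keys, and upload settings are placeholder values, not values from this guide:

```python
import json

# Build a minimal Amazon S3 manifest. "fileLocations" names the target
# files (exact URIs and/or URI prefixes); "globalUploadSettings" describes
# how to parse them. All names below are placeholders.
manifest = {
    "fileLocations": [
        {"URIs": ["https://amzn-s3-demo-bucket.s3.amazonaws.com/sales/2024.csv"]},
        {"URIPrefixes": ["https://amzn-s3-demo-bucket.s3.amazonaws.com/sales/monthly/"]},
    ],
    "globalUploadSettings": {
        "format": "CSV",
        "delimiter": ",",
        "textqualifier": "\"",
        "containsHeader": "true",
    },
}

# Serialize; save this as manifest.json and supply it when creating the dataset.
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

You upload this file (or paste its URL) on the S3 connection screen; the same settings apply to every file that the manifest matches.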

# Creating a dataset from a database


The following procedures walk you through connecting to database data sources and creating datasets. To create datasets from AWS data sources that your Amazon Quick account autodiscovered, use [Creating a dataset from an autodiscovered Amazon Redshift cluster or Amazon RDS instance](#create-a-data-set-autodiscovered). To create datasets from any other database data sources, use [Creating a dataset using a database that's not autodiscovered](#create-a-data-set-database). 

## Creating a dataset from an autodiscovered Amazon Redshift cluster or Amazon RDS instance


Use the following procedure to create a connection to an autodiscovered AWS data source.

**To create a connection to an autodiscovered AWS data source**

1. Check [Data source quotas](data-source-limits.md) to make sure that your target table or query doesn't exceed data source quotas.

1. Confirm that the database credentials you plan to use have appropriate permissions as described in [Required permissions](required-permissions.md). 

1. Make sure that you have configured the cluster or instance for Amazon Quick access by following the instructions in [Network and database configuration requirements](configure-access.md).

1. On the Amazon Quick start page, choose **Data**.

1. Choose **Create**, and then choose **New dataset**.

1. Choose either the **RDS** or the **Redshift Auto-discovered** icon, depending on the AWS service that you want to connect to.

1. Enter the connection information for the data source, as follows:
   + For **Data source name**, enter a name for the data source.
   + For **Instance ID**, choose the name of the instance or cluster that you want to connect to.
   + **Database name** shows the default database for the **Instance ID** cluster or instance. To use a different database on that cluster or instance, enter its name.
   + For **UserName**, enter the user name of a user account that has permissions to do the following: 
     + Access the target database. 
     + Read (perform a `SELECT` statement on) any tables in that database that you want to use.
   + For **Password**, enter the password for the account that you entered.

1. Choose **Validate connection** to verify that your connection information is correct.

1. If the connection validates, choose **Create data source**. If not, correct the connection information and try validating again.
**Note**  
Amazon Quick automatically secures connections to Amazon RDS instances and Amazon Redshift clusters by using Secure Sockets Layer (SSL). You don't need to do anything to enable this.

1. Choose one of the following:
   + **Custom SQL**

     On the next screen, you can choose to write a query with the **Use custom SQL** option. Doing this opens a screen named **Enter custom SQL query**, where you can enter a name for your query, and then enter the SQL. For best results, compose the query in a SQL editor, and then paste it into this window. After you name and enter the query, you can choose **Edit/Preview data** or **Confirm query**. Choose **Edit/Preview data** to immediately go to data preparation. Choose **Confirm query** to validate the SQL and make sure that there are no errors.
   + **Choose tables**

     To connect to specific tables, for **Schema: contain sets of tables**, choose **Select** and then choose a schema. In some cases where there is only a single schema in the database, that schema is automatically chosen, and the schema selection option isn't displayed.

     To prepare the data before creating an analysis, choose **Edit/Preview data** to open data preparation. Use this option if you want to join to more tables.

     Otherwise, after choosing a table, choose **Select**.

1. Choose one of the following options:
   + Prepare the data before creating an analysis. To do this, choose **Edit/Preview data** to open data preparation for the selected table. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).
   + Create a dataset and analysis using the table data as-is and import the dataset data into SPICE for improved performance (recommended). To do this, check the table size and the SPICE indicator to see if you have enough capacity.

     If you have enough SPICE capacity, choose **Import to SPICE for quicker analytics**, and then create an analysis by choosing **Visualize**.
**Note**  
If you want to use SPICE and you don't have enough space, choose **Edit/Preview data**. In data preparation, you can remove fields from the dataset to decrease its size. You can also apply a filter or write a SQL query that reduces the number of rows or columns returned. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).
   + To create a dataset and an analysis using the table data as-is, and to have the data queried directly from the database, choose the **Directly query your data** option. Then create an analysis by choosing **Visualize**.
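The console steps above also have an API equivalent. The following sketch assembles a `create_data_source` request for the QuickSight API through boto3; the account ID, data source ID, host, database, and credentials are all placeholder values, and the actual call is left commented out because it requires AWS credentials:

```python
# Sketch of creating the same Redshift data source through the API.
# Every identifier and credential below is a placeholder.
create_data_source_request = {
    "AwsAccountId": "111122223333",
    "DataSourceId": "redshift-sales-ds",
    "Name": "Redshift sales data",
    "Type": "REDSHIFT",
    "DataSourceParameters": {
        "RedshiftParameters": {
            "Host": "clustername.1234abcd.us-west-2.redshift.amazonaws.com",
            "Port": 5439,
            "Database": "dev",
        }
    },
    "Credentials": {
        "CredentialPair": {
            "Username": "quicksight_reader",   # needs SELECT on the target tables
            "Password": "example-password",
        }
    },
    "SslProperties": {"DisableSsl": False},    # connections are secured with SSL
}

# To actually create the data source:
# import boto3
# client = boto3.client("quicksight", region_name="us-west-2")
# response = client.create_data_source(**create_data_source_request)
```

The `Username`/`Password` pair maps to the **UserName** and **Password** fields in the console procedure above.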

## Creating a dataset using a database that's not autodiscovered


Use the following procedure to create a connection to any database other than an autodiscovered Amazon Redshift cluster or Amazon RDS instance. Such databases include Amazon Redshift clusters and Amazon RDS instances that are in a different AWS Region or are associated with a different AWS account. They also include MariaDB, Microsoft SQL Server, MySQL, Oracle, and PostgreSQL instances that are on-premises, in Amazon EC2, or in some other accessible environment.

**To create a connection to a database that isn't an autodiscovered Amazon Redshift cluster or RDS instance**

1. Check [Data source quotas](data-source-limits.md) to make sure that your target table or query doesn't exceed data source quotas.

1. Confirm that the database credentials that you plan to use have appropriate permissions as described in [Required permissions](required-permissions.md). 

1. Make sure that you have configured the cluster or instance for Amazon Quick access by following the instructions in [Network and database configuration requirements](configure-access.md).

1. On the Amazon Quick start page, choose **Manage data**.

1. Choose **Create**, and then choose **New data set**.

1. Choose the **Redshift Manual connect** icon if you want to connect to an Amazon Redshift cluster in another AWS Region or associated with a different AWS account. Or choose the appropriate database management system icon to connect to an instance of Amazon Aurora, MariaDB, Microsoft SQL Server, MySQL, Oracle, or PostgreSQL.

1. Enter the connection information for the data source, as follows:
   + For **Data source name**, enter a name for the data source.
   + For **Database server**, enter one of the following values:
     + For an Amazon Redshift cluster or Amazon RDS instance, enter the endpoint of the cluster or instance without the port number. For example, if the endpoint value is `clustername.1234abcd.us-west-2.redshift.amazonaws.com:1234`, then enter `clustername.1234abcd.us-west-2.redshift.amazonaws.com`. You can get the endpoint value from the **Endpoint** field on the cluster or instance detail page in the AWS console.
     + For an Amazon EC2 instance of MariaDB, Microsoft SQL Server, MySQL, Oracle, or PostgreSQL, enter the public DNS address. You can get the public DNS value from the **Public DNS** field on the instance detail pane in the Amazon EC2 console.
     + For a non-Amazon EC2 instance of MariaDB, Microsoft SQL Server, MySQL, Oracle, or PostgreSQL, enter the hostname or public IP address of the database server. If you are using Secure Sockets Layer (SSL) for a secured connection (recommended), you likely need to provide the hostname to match the information required by the SSL certificate. For a list of accepted certificates, see [Amazon Quick SSL and CA certificates](configure-access.md#ca-certificates).
   + For **Port**, enter the port that the cluster or instance uses for connections.
   + For **Database name**, enter the name of the database that you want to use.
   + For **UserName**, enter the user name of a user account that has permissions to do the following: 
     + Access the target database. 
     + Read (perform a `SELECT` statement on) any tables in that database that you want to use.
   + For **Password**, enter the password associated with the account you entered.

1. (Optional) If you are connecting to anything other than an Amazon Redshift cluster and you *don't* want a secured connection, clear **Enable SSL**. *We strongly recommend leaving SSL enabled*, because an unsecured connection can be open to tampering. 

   For more information on how the target instance uses SSL to secure connections, see the documentation for the target database management system. Amazon Quick doesn't accept self-signed SSL certificates as valid. For a list of accepted certificates, see [Amazon Quick SSL and CA certificates](configure-access.md#ca-certificates).

   Amazon Quick automatically secures connections to Amazon Redshift clusters by using SSL. You don't need to do anything to enable this.

   Some databases, such as Presto and Apache Spark, must meet additional requirements before Amazon Quick can connect. For more information, see [Creating a data source using Presto](create-a-data-source-presto.md), or [Creating a data source using Apache Spark](create-a-data-source-spark.md).

1. (Optional) Choose **Validate connection** to verify that your connection information is correct.

1. If the connection validates, choose **Create data source**. If not, correct the connection information and try validating again.

1. Choose one of the following:
   + **Custom SQL**

     On the next screen, you can choose to write a query with the **Use custom SQL** option. Doing this opens a screen named **Enter custom SQL query**, where you can enter a name for your query, and then enter the SQL. For best results, compose the query in a SQL editor, and then paste it into this window. After you name and enter the query, you can choose **Edit/Preview data** or **Confirm query**. Choose **Edit/Preview data** to immediately go to data preparation. Choose **Confirm query** to validate the SQL and make sure that there are no errors.
   + **Choose tables**

     To connect to specific tables, for **Schema: contain sets of tables**, choose **Select** and then choose a schema. In some cases where there is only a single schema in the database, that schema is automatically chosen, and the schema selection option isn't displayed.

     To prepare the data before creating an analysis, choose **Edit/Preview data** to open data preparation. Use this option if you want to join to more tables.

     Otherwise, after choosing a table, choose **Select**.

1. Choose one of the following options:
   + Prepare the data before creating an analysis. To do this, choose **Edit/Preview data** to open data preparation for the selected table. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).
   + Create a dataset and an analysis using the table data as-is and import the dataset data into SPICE for improved performance (recommended). To do this, check the table size and the SPICE indicator to see if you have enough space.

     If you have enough SPICE capacity, choose **Import to SPICE for quicker analytics**, and then create an analysis by choosing **Visualize**.
**Note**  
If you want to use SPICE and you don't have enough space, choose **Edit/Preview data**. In data preparation, you can remove fields from the dataset to decrease its size. You can also apply a filter or write a SQL query that reduces the number of rows or columns returned. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).
   + Create a dataset and an analysis using the table data as-is and have the data queried directly from the database. To do this, choose the **Directly query your data** option. Then create an analysis by choosing **Visualize**.
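The **Use custom SQL** path in the steps above also maps onto the API. The following is a sketch of a `create_data_set` request whose physical table is a custom SQL query; the account ID, ARNs, and the query itself are placeholders, and `ImportMode` mirrors the console's choice between **Import to SPICE** and **Directly query your data**:

```python
# Sketch: create a dataset from a custom SQL query via the QuickSight API.
# All identifiers, ARNs, and the SQL below are placeholder values.
create_data_set_request = {
    "AwsAccountId": "111122223333",
    "DataSetId": "sales-by-region",
    "Name": "Sales by region",
    "ImportMode": "SPICE",  # or "DIRECT_QUERY" to query the database directly
    "PhysicalTableMap": {
        "sales-sql": {
            "CustomSql": {
                "DataSourceArn": "arn:aws:quicksight:us-west-2:111122223333:datasource/redshift-sales-ds",
                "Name": "sales_by_region",
                "SqlQuery": "SELECT region, SUM(amount) AS total FROM sales GROUP BY region",
                # Declare the columns the query returns and their types.
                "Columns": [
                    {"Name": "region", "Type": "STRING"},
                    {"Name": "total", "Type": "DECIMAL"},
                ],
            }
        }
    },
}

# To actually create the dataset:
# import boto3
# boto3.client("quicksight").create_data_set(**create_data_set_request)
```

As with the console flow, composing and testing the SQL in a separate editor first, and then pasting it in, tends to be the least error-prone approach.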

# Creating a dataset using an existing data source

After you make an initial connection to a Salesforce, AWS data store, or other database data source, Amazon Quick saves the connection information. It adds the data source to the **FROM EXISTING DATA SOURCES** section of the **Create a Data Set** page. You can use these existing data sources to create new datasets without respecifying connection information.

## Creating a dataset using an existing Amazon S3 data source


Use the following procedure to create a dataset using an existing Amazon S3 data source.

**To create a dataset using an existing S3 data source**

1. On the Amazon Quick start page, choose **Data**.

1. Choose **Create**, and then choose **New dataset**.

1. Choose the Amazon S3 data source to use.

1. To prepare the data before creating the dataset, choose **Edit/Preview data**. To create an analysis using the data as-is, choose **Visualize**.

## Creating a dataset using an existing Amazon Athena data source


To create a dataset using an existing Amazon Athena data source, use the following procedure.

**To create a dataset from an existing Athena connection profile**

1. On the Amazon Quick start page, choose **Data**.

1. Choose **Create**, and then choose **New data set**.

1. Choose the connection profile icon for the existing data source that you want to use. Connection profiles are labeled with the data source icon and the name provided by the person who created the connection.

1. Choose **Create data set**.

   Amazon Quick creates a connection profile for this data source based only on the Athena workgroup. The database and table aren't saved. 

1. On the **Choose your table** screen, do one of the following:
   + To write a SQL query, choose **Use custom SQL**.
   + To choose a database and table, first select your database from the **Database** list. Next, choose a table from the list that appears for your database.

## Create a dataset using an existing Salesforce data source


Use the following procedure to create a dataset using an existing Salesforce data source.

**To create a dataset using an existing Salesforce data source**

1. On the Amazon Quick start page, choose **Data**.

1. Choose **Create**, and then choose **New data set**.

1. Choose the Salesforce data source to use.

1. Choose **Create Data Set**.

1. Choose one of the following:
   + **Custom SQL**

     On the next screen, you can choose to write a query with the **Use custom SQL** option. Doing this opens a screen named **Enter custom SQL query**, where you can enter a name for your query, and then enter the SQL. For best results, compose the query in a SQL editor, and then paste it into this window. After you name and enter the query, you can choose **Edit/Preview data** or **Confirm query**. Choose **Edit/Preview data** to immediately go to data preparation. Choose **Confirm query** to validate the SQL and make sure that there are no errors.
   + **Choose tables**

     To connect to specific tables, for **Data elements: contain your data**, choose **Select** and then choose either **REPORT** or **OBJECT**. 

     To prepare the data before creating an analysis, choose **Edit/Preview data** to open data preparation. Use this option if you want to join to more tables.

     Otherwise, after choosing a table, choose **Select**.

1. On the next screen, choose one of the following options:
   + To create a dataset and an analysis using the data as-is, choose **Visualize**.
**Note**  
If you don't have enough [SPICE](spice.md) capacity, choose **Edit/Preview data**. In data preparation, you can remove fields from the dataset to decrease its size or apply a filter that reduces the number of rows returned. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).
   + To prepare the data before creating an analysis, choose **Edit/Preview data** to open data preparation for the selected report or object. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).

## Creating a dataset using an existing database data source


Use the following procedure to create a dataset using an existing database data source.

**To create a dataset using an existing database data source**

1. On the Amazon Quick start page, choose **Data**.

1. Choose **Create**, and then choose **New data set**.

1. Choose the database data source to use, and then choose **Create Data Set**.

1. Choose one of the following:
   + **Custom SQL**

     On the next screen, you can choose to write a query with the **Use custom SQL** option. Doing this opens a screen named **Enter custom SQL query**, where you can enter a name for your query, and then enter the SQL. For best results, compose the query in a SQL editor, and then paste it into this window. After you name and enter the query, you can choose **Edit/Preview data** or **Confirm query**. Choose **Edit/Preview data** to immediately go to data preparation. Choose **Confirm query** to validate the SQL and make sure that there are no errors.
   + **Choose tables**

     To connect to specific tables, for **Schema: contain sets of tables**, choose **Select** and then choose a schema. In some cases where there is only a single schema in the database, that schema is automatically chosen, and the schema selection option isn't displayed.

     To prepare the data before creating an analysis, choose **Edit/Preview data** to open data preparation. Use this option if you want to join to more tables.

     Otherwise, after choosing a table, choose **Select**.

1. Choose one of the following options:
   + Prepare the data before creating an analysis. To do this, choose **Edit/Preview data** to open data preparation for the selected table. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).
   + Create a dataset and an analysis using the table data as-is and import the dataset data into [SPICE](spice.md) for improved performance (recommended). To do this, check the SPICE indicator to see if you have enough space.

     If you have enough SPICE capacity, choose **Import to SPICE for quicker analytics**, and then create an analysis by choosing **Visualize**.
**Note**  
If you want to use SPICE and you don't have enough space, choose **Edit/Preview data**. In data preparation, you can remove fields from the dataset to decrease its size. You can also apply a filter or write a SQL query that reduces the number of rows or columns returned. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).
   + Create a dataset and an analysis using the table data as-is and have the data queried directly from the database. To do this, choose the **Directly query your data** option. Then create an analysis by choosing **Visualize**.
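The **Choose tables** path corresponds to a `RelationalTable` entry in the same API. A sketch of a dataset that maps directly to one schema-qualified table (the ARN, schema, table, and column names are placeholders):

```python
# Sketch: the "Choose tables" path as a QuickSight API request -- a dataset
# backed by a single relational table. All names below are placeholders.
choose_table_request = {
    "AwsAccountId": "111122223333",
    "DataSetId": "orders-table",
    "Name": "Orders",
    "ImportMode": "DIRECT_QUERY",  # or "SPICE"
    "PhysicalTableMap": {
        "orders": {
            "RelationalTable": {
                "DataSourceArn": "arn:aws:quicksight:us-west-2:111122223333:datasource/pg-orders-ds",
                "Schema": "public",       # the schema you picked in the console
                "Name": "orders",         # the table you picked
                "InputColumns": [
                    {"Name": "order_id", "Type": "INTEGER"},
                    {"Name": "order_date", "Type": "DATETIME"},
                ],
            }
        }
    },
}

# import boto3
# boto3.client("quicksight").create_data_set(**choose_table_request)
```

If the database has only one schema, the console picks it for you; in the API, `Schema` is always spelled out.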

# Creating a dataset using an existing dataset in Amazon Quick

After you create a dataset in Amazon Quick, you can create additional datasets using it as a source. When you do this, any data preparation that the parent dataset contains, such as any joins or calculated fields, is kept. You can add additional preparation to the data in the new child datasets, such as joining new data and filtering data. You can also set up your own data refresh schedule for the child dataset and track the dashboards and analyses that use it.

Child datasets that are created using a dataset with RLS rules active as a source inherit the parent dataset's RLS rules. Users who are creating a child dataset from a larger parent dataset can only see the data that they have access to in the parent dataset. Then, you can add more RLS rules to the new child dataset in addition to the inherited RLS rules to further manage who can access the data that is in the new dataset. You can only create child datasets from datasets with RLS rules active in Direct Query.

Creating datasets from existing Quick datasets has the following advantages:
+ **Central management of datasets** – Data engineers can easily scale to the needs of multiple teams within their organization. To do this, they can develop and maintain a few general-purpose datasets that describe the organization's main data models.
+ **Reduction of data source management** – Business analysts (BAs) often spend lots of time and effort requesting access to databases, managing database credentials, finding the right tables, and managing Quick data refresh schedules. Building new datasets from existing datasets means that BAs don't have to start from scratch with raw data from databases. They can start with curated data.
+ **Predefined key metrics** – By creating datasets from existing datasets, data engineers can centrally define and maintain critical data definitions across their company's many organizations. Examples might be sales growth and net marginal return. With this feature, data engineers can also distribute changes to those definitions. This approach means that their business analysts can get started with visualizing the right data more quickly and reliably.
+ **Flexibility to customize data** – By creating datasets from existing datasets, business analysts get more flexibility to customize datasets for their own business needs, without worrying about disrupting data for other teams.

For example, let's say that you're part of an ecommerce central team of five data engineers. You and your team have access to sales, orders, cancellations, and returns data in a database. You have created a Quick dataset by joining 18 other dimension tables through a schema. A key metric that your team has created is the calculated field order product sales (OPS), defined as OPS = product quantity × price.

Your team serves over 100 business analysts across 10 different teams in eight countries. These include the Coupons team, the Outbound Marketing team, the Mobile Platform team, and the Recommendations team. All of these teams use the OPS metric as a base to analyze their own business line.

Rather than manually creating and maintaining hundreds of unconnected datasets, your team reuses datasets to create multiple levels of datasets for teams across the organization. Doing this centralizes data management and allows each team to customize the data for their own needs. At the same time, this syncs updates to the data, such as updates to metric definitions, and maintains row-level and column-level security. For example, individual teams in your organization can use the centralized datasets. They can then combine them with the data specific to their team to create new datasets and build analyses on top of them.

Along with using the key OPS metric, other teams in your organization can reuse column metadata from the centralized datasets that you created. For example, the Data Engineering team can define metadata, such as *name*, *description*, *data type*, and *folders*, in a centralized dataset. All subsequent teams can use it.

**Note**  
Amazon Quick supports creating up to two additional levels of datasets from a single dataset.  
For example, from a parent dataset, you can create a child dataset and then a grandchild dataset for a total of three dataset levels.

## Creating a dataset from an existing dataset


Use the following procedure to create a dataset from an existing dataset.

**To create a dataset from an existing dataset**

1. From the Quick start page, choose **Data** in the pane at left.

1. Choose **Create**, and then choose the dataset that you want to use to create a new dataset.

1. On the page that opens for that dataset, choose the drop-down menu for **Use in analysis**, and then choose **Use in dataset**.

   The data preparation page opens and preloads everything from the parent dataset, including calculated fields, joins, and security settings.

1. On the data preparation page that opens, for **Query mode** at bottom left, choose how you want the dataset to pull in changes and updates from the original, parent dataset. You can choose the following options: 
   + **Direct query** – This is the default query mode. If you choose this option, the data for this dataset automatically refreshes when you open an associated dataset, analysis, or dashboard. However, the following limitations apply:
     + If the parent dataset allows direct querying, you can use direct query mode in the child dataset.
     + If you have multiple parent datasets in a join, you can choose direct query mode for your child dataset only if all the parents are from the same underlying data source. For example, the same Amazon Redshift connection.
     + Direct query is supported for a single SPICE parent dataset. It is not supported for multiple SPICE parent datasets in a join.
   + **SPICE** – If you choose this option, you can set up a schedule for your new dataset to sync with the parent dataset. For more information about creating SPICE refresh schedules for datasets, see [Refreshing SPICE data](refreshing-imported-data.md).

1. (Optional) Prepare your data for analysis. For more information about preparing data, see [Preparing data in Amazon Quick Sight](preparing-data.md).

1. (Optional) Set up row-level or column-level security (RLS/CLS) to restrict access to the dataset. For more information about setting up RLS, see [Using row-level security with user-based rules to restrict access to a dataset](restrict-access-to-a-data-set-using-row-level-security.md). For more information about setting up CLS, see [Using column-level security to restrict access to a dataset](restrict-access-to-a-data-set-using-column-level-security.md).
**Note**  
You can set up RLS/CLS on child datasets only. RLS/CLS on parent datasets is not supported.

1. When you're finished, choose **Save & publish** to save your changes and publish the new child dataset. Or choose **Publish & visualize** to publish the new child dataset and begin visualizing your data. 

# Restricting others from creating new datasets from your dataset


When you create a dataset in Amazon Quick, you can prevent others from using it as a source for other datasets. You can specify if others can use it to create any datasets at all. Or you can specify the type of datasets others can or can't create from your dataset, such as direct query datasets or SPICE datasets.

Use the following procedure to learn how to restrict others from creating new datasets from your dataset.

**To restrict others from creating new datasets from your dataset**

1. From the Quick start page, choose **Data** in the pane at left.

1. Choose **Create**, and then choose the dataset that you want to restrict creating new datasets from.

1. On the page that opens for that dataset, choose **Edit dataset**.

1. On the data preparation page that opens, choose **Manage** at upper right, and then choose **Properties**.

1. In the **Dataset properties** pane that opens, choose from the following options:
   + To restrict anyone from creating any type of new datasets from this dataset, turn off **Allow new datasets to be created from this one**.

     The toggle is blue when creating new datasets is allowed. It's gray when creating new datasets isn't allowed.
   + To restrict others from creating direct query datasets, clear **Allow direct query**.
   + To restrict others from creating SPICE copies of your dataset, clear **Allow SPICE copies**.

     For more information about SPICE datasets, see [Importing data into SPICE](spice.md).

1. Close the pane.
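These console toggles correspond to the `DataSetUsageConfiguration` parameter of `create_data_set`/`update_data_set`. A sketch (account and dataset IDs are placeholders):

```python
# Sketch: the same restrictions as API fields. The two flags correspond to
# clearing "Allow direct query" and "Allow SPICE copies" in the console;
# setting both effectively blocks new datasets from being created from
# this one. IDs in the commented call are placeholders.
usage_configuration = {
    "DisableUseAsDirectQuerySource": True,  # block direct query child datasets
    "DisableUseAsImportedSource": True,     # block SPICE copies
}

# Passed alongside the dataset's other properties:
# import boto3
# boto3.client("quicksight").update_data_set(
#     AwsAccountId="111122223333",
#     DataSetId="my-dataset",
#     Name="My dataset",
#     PhysicalTableMap={...},          # unchanged table mapping
#     ImportMode="SPICE",
#     DataSetUsageConfiguration=usage_configuration,
# )
```

Leaving either flag `False` allows that kind of child dataset, matching the per-type toggles described above.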

# Editing datasets


You can edit an existing dataset to perform data preparation. For more information about Quick Sight data preparation functionality, see [Preparing data in Amazon Quick Sight](preparing-data.md).

You can open a dataset for editing from the **Datasets** page, or from the analysis page. Editing a dataset from either location modifies the dataset for all analyses that use it.

## Things to consider when editing datasets


In two situations, changes to a dataset might cause concern. One is if you deliberately edit the dataset. The other is if your data source has changed so much that it affects the analyses based on it. 

**Important**  
Analyses that are in production usage should be protected so they continue to function correctly. 

We recommend the following when you're dealing with data changes:
+ Document your data sources and datasets, and the visuals that rely upon them. Documentation should include screenshots, fields used, placement in field wells, filters, sorts, calculations, colors, formatting, and so on. Record everything that you need to recreate the visual. You can also track which Quick Sight resources use a dataset in the dataset management options. For more information, see [Tracking dashboards and analyses that use a dataset](track-analytics-that-use-dataset.md).
+ When you edit a dataset, try not to make changes that might break existing visuals. For example, don't remove columns that are being used in a visual. If you must remove a column, create a calculated column in its place. The replacement column should have the same name and data type as the original. 
+ If your data source or dataset changes in your source database, adapt your visual to accommodate the change, as described previously. Or you can try to adapt the source database. For example, you might create a view of the source table (document). Then if the table changes, you can adjust the view to include or exclude columns (attributes), change data types, fill null values, and so on. Or, in another circumstance, if your dataset is based on a slow SQL query, you might create a table to hold the results of the query. 

  If you can't sufficiently adapt the source of the data, recreate the visuals based on your documentation of the analysis.
+ If you no longer have access to a data source, your analyses based on that source are empty. The visuals that you created still exist, but they can't display until they have some data to show. This result can happen if permissions are changed by your administrator.
+ If you remove the dataset a visual is based on, you might need to recreate the visual from your documentation. You can edit the visual and select a new dataset to use with it. If you need to consistently use a new file to replace an older one, store your data in a location that is consistently available. For example, you might store your .csv file in Amazon S3 and create an S3 dataset to use for your visuals. For more information on accessing files stored in S3, see [Creating a dataset using Amazon S3 files](create-a-data-set-s3.md). 

  Or you can import the data into a table, and base your visual on a query. This way, the data structures don't change, even if the data contained in them changes.
+ To centralize data management, consider creating general-purpose datasets that others can use to create their own datasets from. For more information, see [Creating a dataset using an existing dataset in Amazon Quick](create-a-dataset-existing-dataset.md).
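
As a minimal sketch of the view approach described above (the table, column, and view names here are hypothetical), you might absorb source-schema changes like this:

```
CREATE VIEW sales_for_reporting AS
SELECT order_id,
       order_date,
       COALESCE(region, 'Unknown') AS region,    -- fill null values
       CAST(amount AS DECIMAL(12,2)) AS amount   -- normalize the data type
FROM sales_raw;
```

If the `sales_raw` table changes later, you can adjust the view definition instead of editing the dataset or the visuals that depend on it.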

## Editing a dataset from the Datasets page


1. From the Quick start page, choose **Data** at left.

1. On the **Data** page that opens, choose the dataset that you want to edit, and then choose **Edit dataset** at upper right.

   The data preparation page opens. For more information about the types of edits you can make to datasets, see [Preparing data in Amazon Quick Sight](preparing-data.md).

## Editing a dataset in an analysis


Use the following procedure to edit a dataset from the analysis page.

**To edit a dataset from the analysis page**

1. In your analysis, choose the pencil icon at the top of the **Fields list** pane.

1. On the **Data sets in this analysis** page that opens, choose the three dots at the right of the dataset that you want to edit, and then choose **Edit**.

   The dataset opens in the data preparation page. For more information about the types of edits you can make to datasets, see [Preparing data in Amazon Quick Sight](preparing-data.md).

# Reverting datasets back to previous published versions

When you save and publish changes to a dataset in Amazon Quick Sight, a new version of the dataset is created. At any time, you can see a list of all the previous published versions of that dataset. You can also preview a specific version in that history, or even revert the dataset back to a previous version, if needed.

The following limitations apply to dataset versioning:
+ Only the most recent 1,000 versions of a dataset are shown in the publishing history and are available to revert to.
+ After you exceed 1,000 published versions, the oldest versions are automatically removed from the publishing history, and the dataset can no longer be reverted back to them.

Use the following procedure to revert a dataset to a previous published version.

**To revert a dataset to a previous published version**

1. From the Quick start page, choose **Data**.

1. On the **Data** page, choose a dataset, and then choose **Edit dataset** at upper right.

   For more information about editing datasets, see [Editing datasets](edit-a-data-set.md).

1. On the dataset preparation page that opens, choose the **Manage** icon in the blue toolbar at upper right, and then choose **Publishing history**.

   A list of previous published versions appears at right.

1. In the **Publishing history** pane, find the version that you want and choose **Revert**.

   To preview the version before reverting, choose **Preview**.

   The dataset is reverted and a confirmation message appears. The **Publishing history** pane also updates to show the active version of the dataset.

## Troubleshooting reverting versions

Sometimes, the dataset can't be reverted to a specific version for one of the following reasons:
+ The dataset uses one or more data sources that were deleted.

  If this error occurs, you can't revert the dataset to a previous version.
+ Reverting would make a calculated field invalid.

  If this error occurs, you can edit or remove the calculated field, and then save the dataset. Doing this creates a new version of the dataset.
+ One or more columns are missing in the data source.

  If this error occurs, Quick Sight shows the latest schema from the data source in the preview to reconcile differences between versions. Any calculated field, field name, field type, and filter changes shown in the schema preview are from the version that you want to revert to. You can save this reconciled schema as a new version of the dataset. Or you can return to the active (latest) version by choosing **Preview** on the top (latest) version in the publishing history.

# Duplicating datasets


You can duplicate an existing dataset to save a copy of it with a new name. The new dataset is a completely separate copy. 

The **Duplicate dataset** option is available if both of the following are true: you own the dataset and you have permission to the data source.

**To duplicate a dataset**

1. From the Quick start page, choose **Data** at left.

1. Choose the dataset that you want to duplicate.

1. On the dataset details page that opens, choose the drop-down for **Edit dataset**, and then choose **Duplicate**.

1. On the Duplicate dataset page that opens, give the duplicated dataset a name, and then choose **Duplicate**.

   The duplicated dataset details page opens. From this page, you can edit the dataset, set up a refresh schedule, and more.

# Sharing datasets


You can give other Quick Sight users and groups access to a dataset by sharing it with them. Then they can create analyses from it. If you make them co-owners, they can also refresh, edit, delete, or reshare the dataset. 

## Sharing a dataset


If you have owner permissions on a dataset, use the following procedure to share it.

**To share a dataset**

1. From the Quick start page, choose **Data** at left.

1. On the **Data** page, choose the dataset that you want to share.

1. On the dataset details page that opens, choose the **Permissions** tab, and then choose **Add users & groups**.

1. Enter the user or group that you want to share this dataset with, and then choose **Add**. You can only invite users who belong to the same Quick account.

   Repeat this step until you have entered information for everyone you want to share the dataset with.

1. For the **Permissions** column, choose a role for each user or group to give them permissions on the dataset.

   Choose **Viewer** to allow the user to create analyses and datasets from the dataset. Choose **Owner** to allow the user to do that and also refresh, edit, delete, and reshare the dataset.

   Users receive emails with a link to the dataset. Groups don't receive invitation emails.

# Viewing and editing the permissions of users that a dataset is shared with


If you have owner permissions on a dataset, you can use the following procedure to view, edit, or change user access to it. 

**To view, edit, or change user access to a dataset if you have owner permissions for it**

1. From the Quick start page, choose **Data** at left.

1. On the **Data** page, choose the dataset that you want to share.

1. On the dataset details page that opens, choose the **Permissions** tab.

   A list of all users and groups with access to the dataset is displayed.

1. (Optional) To change permission roles for a user or group, choose the drop-down menu in the **Permissions** column for the user or group. Then choose either **Viewer** or **Owner**.

# Revoking access to a dataset


If you have owner permissions on a dataset, you can use the following procedure to revoke user access to a dataset.

**To revoke user access to a dataset if you have owner permissions for it**

1. From the Quick start page, choose **Data** at left.

1. On the **Data** page, choose the dataset that you want to share.

1. On the dataset details page that opens, choose the **Permissions** tab.

   A list of all users and groups with access to the dataset is displayed.

1. In the **Actions** column for the user or group, choose **Revoke access**.

# Tracking dashboards and analyses that use a dataset

When you create a dataset in Quick Sight, you can track which dashboards and analyses use that dataset. This is useful when you want to see which resources will be affected when you change a dataset, or when you want to delete a dataset. 

Use the following procedure to see which dashboards and analyses use a dataset.

**To track resources that use a dataset**

1. From the Quick start page, choose **Data** in the pane at left.

1. On the **Data** page, choose the dataset that you want to track resources for.

1. In the page that opens for that dataset, choose **Edit dataset**.

1. In the data preparation page that opens, choose **Manage** at upper right, and then choose **Usage**.

   The dashboards and analyses that use the dataset are listed in the pane that opens.

# Using dataset parameters in Amazon Quick

In Amazon Quick, authors can use dataset parameters in direct query to dynamically customize their datasets and apply reusable logic to their datasets. A *dataset parameter* is a parameter created at the dataset level. It's consumed by an analysis parameter through controls, calculated fields, filters, actions, URLs, titles, and descriptions. For more information on analysis parameters, see [Parameters in Amazon Quick](parameters-in-quicksight.md). The following list describes three actions that can be performed with dataset parameters:
+  **Custom SQL in direct query** – Dataset owners can insert dataset parameters into the custom SQL of a direct query dataset. When these parameters are applied to a filter control in a Quick analysis, users can filter their custom data faster and more efficiently.
+ **Repeatable variables** – Static values that appear in multiple locations in the dataset page can be modified in one action using custom dataset parameters.
+ **Move calculated fields to datasets** – Quick authors can copy calculated fields with parameters in an analysis and migrate them to the dataset level. This protects calculated fields at the analysis level from accidental modification and lets calculated fields be shared across multiple analyses.

In some situations, dataset parameters improve filter control performance for direct query datasets that require complex custom SQL and simplify business logic at the dataset level.

**Topics**
+ [

## Dataset parameter limitations
](#dataset-parameters-limitations)
+ [

# Creating dataset parameters in Amazon Quick
](dataset-parameters-SQL.md)
+ [

# Inserting dataset parameters into custom SQL
](dataset-parameters-insert-parameter.md)
+ [

# Adding dataset parameters to calculated fields
](dataset-parameters-calculated-fields.md)
+ [

# Adding dataset parameters to filters
](dataset-parameters-dataset-filters.md)
+ [

# Using dataset parameters in Quick analyses
](dataset-parameters-analysis.md)
+ [

# Advanced use cases of dataset parameters
](dataset-parameters-advanced-options.md)

## Dataset parameter limitations


This section covers known limitations that you might encounter when working with dataset parameters in Amazon Quick.
+ When dashboard readers schedule emailed reports, selected controls don't propagate to the dataset parameters that are included in the report that's attached to the email. Instead, the default values of the parameters are used.
+ Dataset parameters can't be inserted into custom SQL of datasets stored in SPICE.
+ Dynamic defaults can only be configured on the analysis page of the analysis that is using the dataset. You can't configure a dynamic default at the dataset level.
+ The **Select all** option is not supported on multivalue controls of analysis parameters that are mapped to dataset parameters.
+ Cascading controls are not supported for dataset parameters.
+ Dataset parameters can only be used by dataset filters when the dataset is using direct query.
+ In a custom SQL query, only 128 dataset parameters can be used.

# Creating dataset parameters in Amazon Quick

Use the following procedures to get started using dataset parameters.

**To create a new dataset parameter**

1. From the Quick start page, choose **Data** on the left, choose the ellipsis (three dots) next to the dataset that you want to change, and then choose **Edit**.

1. On the **Dataset** page that opens, choose **Parameters** on the left, and then choose the add icon to create a new dataset parameter.

1. In the **Create new parameter** pop-up that appears, enter a parameter name in the **Name** box.

1. In the **Data type** dropdown, choose the parameter data type that you want. Supported data types are `String`, `Integer`, `Number`, and `Datetime`. This option can't be changed after the parameter is created.

1. For **Default value**, enter the default value that you want the parameter to have.
**Note**  
When you map a dataset parameter to an analysis parameter, a different default value can be chosen. When this happens, the default value configured here is overridden by the new default value.

1. For **Values**, choose the value type that you want the parameter to have. **Single value** parameters support single-select dropdown, text field, and list controls. **Multiple values** parameters support multi-select dropdown controls. This option can't be changed after the parameter is created.

1. When you are finished configuring the new parameter, choose **Create** to create the parameter.

# Inserting dataset parameters into custom SQL


You can insert dataset parameters into the custom SQL of a dataset in direct query mode by referencing them as `<<$parameter_name>>` in the SQL statement. At runtime, dashboard users can enter filter control values that are associated with a dataset parameter. Then, they can see the results in the dashboard visuals after the values propagate to the SQL query. You can use parameters to create basic filters based on user input in `where` clauses. Alternatively, you can add `case when` or `if else` clauses to dynamically change the logic of the SQL query based on a parameter's input.

For example, say you want to add a `WHERE` clause to your custom SQL that filters data based on an end user's region name. In this case, you create a single value parameter called `RegionName`:

```
SELECT *
FROM transactions
WHERE region = <<$RegionName>>
```

You can also let users provide multiple values to the parameter. The following example uses a multivalue parameter called `RegionNames`:

```
SELECT *
FROM transactions
WHERE region in (<<$RegionNames>>)
```
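
For illustration (the selected values here are hypothetical), if a user chooses `NY` and `WA` in a control mapped to the multivalue `RegionNames` parameter, the query that runs is equivalent to:

```
SELECT *
FROM transactions
WHERE region in ('NY', 'WA')
```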

In the following more complex example, a dataset author references two dataset parameters, each used twice, to match a user's first and last names that can be selected in a dashboard filter control:

```
SELECT Region, Country, OrderDate, Sales
FROM transactions
WHERE region=
(Case
WHEN <<$UserFIRSTNAME>> In 
    (select firstname from user where region='region1') 
    and <<$UserLASTNAME>> In 
    (select lastname from user where region='region1') 
    THEN 'region1'
WHEN <<$UserFIRSTNAME>> In 
    (select firstname from user where region='region2') 
    and <<$UserLASTNAME>> In 
    (select lastname from user where region='region2') 
    THEN 'region2'
ELSE 'region3'
END)
```

You can also use parameters in `SELECT` clauses to create new columns in a dataset from user input:

```
SELECT Region, Country, date, 
    (case 
    WHEN <<$RegionName>>='EU'
    THEN sum(sales) * 0.93   --convert US dollar to euro
    WHEN <<$RegionName>>='CAN'
    THEN sum(sales) * 0.78   --convert US dollar to Canadian Dollar
    ELSE sum(sales) -- US dollar
    END
    ) as "Sales"
FROM transactions
WHERE region = <<$RegionName>>
```

To create a custom SQL query or to edit an existing query before adding a dataset parameter, see [Using SQL to customize data](adding-a-SQL-query.md).

When you apply custom SQL with a dataset parameter, `<<$parameter_name>>` is used as a placeholder value. When a user chooses one of the parameter values from a control, Quick replaces the placeholder with the values that the user selects on the dashboard.

In the following example, the user enters a new custom SQL query that filters data by state:

```
select * from all_flights
where origin_state_abr = <<$State>>
```

The default value of the parameter is applied to the SQL query and the results appear in the **Preview** pane.
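
For illustration, if the default value of the `State` parameter were `'WA'` (a hypothetical value), the query that runs against the data source would be equivalent to:

```
select * from all_flights
where origin_state_abr = 'WA'
```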

# Adding dataset parameters to calculated fields


You can also add dataset parameters to calculated field expressions using the format `${parameter_name}`.

When you create a calculation, you can choose from the existing parameters in the **Parameters** list. You can't create a calculated field that contains a multivalue parameter.

For more information on adding calculated fields, see [Using calculated fields with parameters in Amazon Quick](parameters-calculated-fields.md).
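
As an illustrative sketch (the parameter and field names here are hypothetical, and the syntax assumes the calculated field `ifelse` function), a calculated field that labels rows matching a `String` dataset parameter named `RegionName` might look like the following:

```
ifelse({region} = ${RegionName}, 'Selected region', 'Other regions')
```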

# Adding dataset parameters to filters


For datasets in direct query mode, dataset authors can use dataset parameters in filters without custom SQL. Dataset parameters can't be added to filters if the dataset is in SPICE.

**To add a dataset parameter to a filter**

1. Open the dataset page of the dataset that you want to create a filter for. Choose **Filters** on the left, and then choose **Add filter**.

1. Enter the name that you want the filter to have and choose the field that you want filtered in the dropdown.

1. After you create the new filter, navigate to the filter in the **Filters** pane, choose the ellipsis (three dots) next to the filter, and then choose **Edit**.

1. For **Filter type**, choose **Custom filter**.

1. For **Filter condition**, choose the condition that you want.

1. Select the **Use parameter** box and choose the dataset parameter that you want the filter to use.

1. When you are finished making changes, choose **Apply**.

# Using dataset parameters in Quick analyses


After you create a dataset parameter and add the dataset to an analysis, map the dataset parameter to a new or existing analysis parameter. After you map a dataset parameter to an analysis parameter, you can use it with filters, controls, and any other analysis parameter feature.

You can manage your dataset parameters in the **Parameters** pane of the analysis that is using the dataset that the parameters belong to. In the **Dataset Parameters** section of the **Parameters** pane, you can choose to see only the unmapped dataset parameters (default). Alternatively, you can choose to see all mapped and unmapped dataset parameters by choosing **ALL** from the **Viewing** dropdown.

## Mapping dataset parameters in new Quick analyses


When you create a new analysis from a dataset that contains parameters, you need to map the dataset parameters to the analysis before you can use them. This is also true when you add a dataset with parameters to an analysis. You can view all unmapped parameters in an analysis in the **Parameters** pane of the analysis. Alternatively, choose **VIEW** in the notification message that appears in the top right of the page when you create the analysis or add the dataset.

**To map a dataset parameter to an analysis parameter**

1. Open the [Quick console](https://quicksight.aws.amazon.com/).

1. Choose the analysis that you want to change.

1. Choose the **Parameters** icon to open the **Parameters** pane.

1. Choose the ellipsis (three dots) next to the dataset parameter that you want to map, choose **Map Parameter**, and then choose the analysis parameter that you want to map your dataset parameter to.

   If your analysis doesn't have any analysis parameters, you can choose **Map parameter** and **Create new** to create an analysis parameter that is automatically mapped to the dataset parameter upon creation.

   1. (Optional) In the **Create new parameter** pop-up that appears, for **Name**, enter a name for the new analysis parameter.

   1. (Optional) For **Static default value**, choose the static default value that you want the parameter to have.

   1. (Optional) Choose **Set a dynamic default** to set a dynamic default for the new parameter.

   1. In the **Mapped dataset parameters** table, you will see the dataset parameter that you are mapping to the new analysis parameter. You can add other dataset parameters to this analysis parameter by choosing the **ADD DATASET PARAMETER** dropdown and then choosing the parameter that you want to map. You can unmap a dataset parameter by choosing the **Remove** button next to the dataset parameter that you want to remove.

   For more information on creating analysis parameters, see [Setting up parameters in Amazon Quick](parameters-set-up.md).

When you map a dataset parameter to an analysis parameter, the analysis parameter represents the dataset parameter wherever it is used in the analysis.

You can also map and unmap dataset parameters to analysis parameters in the **Edit parameter** window. To open the **Edit parameter** window, navigate to the **Parameters** pane, choose the ellipsis (three dots) next to the analysis parameter that you want to change, and then choose **Edit parameter**. You can add other dataset parameters to this analysis parameter by choosing the **ADD DATASET PARAMETER** dropdown and then choosing the parameter that you want to map. You can unmap a dataset parameter by choosing the **Remove** button next to the dataset parameter that you want to remove. You can also remove all mapped dataset parameters by choosing **REMOVE ALL**. When you are done making changes, choose **Update**.

When you delete an analysis parameter, all dataset parameters are unmapped from the analysis and appear in the **UNMAPPED** section of the **Parameters** pane. You can only map a dataset parameter to one analysis parameter at a time. To map a dataset parameter to a different analysis parameter, unmap the dataset parameter and then map it to the new analysis parameter.

## Adding filter controls to mapped analysis parameters


After you map a dataset parameter to an analysis parameter in Quick, you can create filter controls for filters, actions, calculated fields, titles, descriptions, and URLs.

**To add a control to a mapped parameter**

1. In the **Parameters** pane of the analysis page, choose the ellipsis (three dots) next to the mapped analysis parameter that you want, and then choose **Add control**.

1. In the **Add control** window that appears, enter the **Name** that you want and choose the **Style** that you want the control to have. For single value controls, choose between `Dropdown`, `List`, and `Text field`. For multivalue controls, choose `Dropdown`.

1. Choose **Add** to create the control.

# Advanced use cases of dataset parameters

This section covers more advanced options and use cases for working with dataset parameters and dropdown controls. Use the following walkthroughs to create dynamic dropdown values with dataset parameters.

## Using multivalue controls with dataset parameters


When you use dataset parameters that are inserted into the custom SQL of a dataset, the dataset parameters commonly filter data by values from a specific column. If you create a dropdown control and assign the parameter as the value, the dropdown shows only the values that the parameter has filtered. The following procedure shows how you can create a control that is mapped to a dataset parameter and shows all unfiltered values.

**To populate all assigned values in a dropdown control**

1. Create a new single-column dataset in SPICE or direct query that includes all unique values from the original dataset. For example, let's say that your original dataset is using the following custom SQL:

   ```
   select * from all_flights
           where origin_state_abr = <<$State>>
   ```

   To create a single-column table with all unique origin states, apply the following custom SQL to the new dataset:

   ```
   SELECT distinct origin_state_abr FROM all_flights
           order by origin_state_abr asc
   ```

   The SQL expression returns all unique states in alphabetic order. The new dataset does not have any dataset parameters.

1. Enter a **Name** for the new dataset, and then save and publish the dataset. In our example, the new dataset is called `State Codes`.

1. Open the analysis that contains the original dataset, and add the new dataset to the analysis. For information on adding datasets to an existing analysis, see [Adding a dataset to an analysis](adding-a-data-set-to-an-analysis.md).

1. Navigate to the **Controls** pane and find the dropdown control that you want to edit. Choose the ellipsis (three dots) next to the control, and then choose **Edit**.

1. In the **Format control** pane that appears at left, choose **Link to a dataset field** in the **Values** section.

1. For the **Dataset** dropdown that appears, choose the new dataset that you created. In our example, the `State Codes` dataset is chosen.

1. For the **Field** dropdown that appears, choose the appropriate field. In our example, the `origin_state_abr` field is chosen.

After you finish linking the control to the new dataset, all unique values appear in the control's dropdown. These include the values that are filtered out by the dataset parameter.

## Using controls with Select all options


By default, when one or more dataset parameters are mapped to an analysis parameter and added to a control, the `Select all` option is not available. The following procedure shows a workaround that uses the same example scenario from the previous section.

**Note**  
This walkthrough is for datasets that are small enough to load in direct query. If you have a large dataset and want to use the `Select All` option, it is recommended that you load the dataset into SPICE. However, if you want to use the `Select All` option with dataset parameters, this walkthrough describes a way to do so.

To begin, let's say you have a direct query dataset with custom SQL that contains a multivalue parameter called `States`:

```
select * from all_flights
where origin_state_abr in (<<$States>>)
```

**To use the Select all option in a control that uses dataset parameters**

1. In the **Parameters** pane of the analysis, find the dataset parameter that you want to use and choose **Edit** from the ellipsis (three dots) next to the parameter.

1. In the **Edit parameter** window that appears, enter a new default value in the **Static multiple default values** section. In our example, the default value is ` All States`. Note that the example uses a leading space character so that the default value appears as the first item in the control.

1. Choose **Update** to update the parameter.

1. Navigate to the dataset that contains the dataset parameter that you're using in the analysis. Edit the custom SQL of the dataset to include a default use case for your new static multiple default values. Using the ` All States` example, the SQL expression appears as follows:

   ```
   select * from public.all_flights
   where
       ' All States' in (<<$States>>) or
       origin_state_abr in (<<$States>>)
   ```

   If the user chooses ` All States` in the control, the new SQL expression returns all unique records. If the user chooses a different value from the control, the query returns values that were filtered by the dataset parameter.

### Using controls with Select all and multivalue options


You can combine the previous `Select all` procedure with the multivalue control method discussed earlier to create dropdown controls that contain a `Select all` value in addition to multiple values that the user can select. This walkthrough assumes that you have followed the previous procedures, that you know how to map dataset parameters to analysis parameters, and that you can create controls in an analysis. For more information on mapping analysis parameters, see [Mapping dataset parameters in new Quick analyses](dataset-parameters-analysis.md#dataset-parameters-map-to-analysis). For more information on creating controls in an analysis that is using dataset parameters, see [Adding filter controls to mapped analysis parameters](dataset-parameters-analysis.md#dataset-parameters-analysis-filter-control).

**To add multiple values to a control with a Select all option and a mapped dataset parameter**

1. Open the analysis that has the original dataset with a `Select all` custom SQL expression and a second dataset that includes all possible values of the filtered column that exists in the original dataset.

1. Navigate to the secondary dataset that was created earlier to return all values of a filtered column. Add a custom SQL expression that adds your previously configured `Select all` option to the query. The following example adds the ` All States` record to the top of the list of returned values of the dataset:

   ```
   (Select ' All States' as origin_state_abr)
       Union All
       (SELECT distinct origin_state_abr FROM all_flights
       order by origin_state_abr asc)
   ```

1. Go back to the analysis that the datasets belong to and map the dataset parameter that you are using to the analysis parameter that you created in step 3 of the previous procedure. The analysis parameter and dataset parameter can have the same name. In our example, the analysis parameter is called `States`.

1. Create a new filter control or edit an existing filter control and choose **Hide Select All** to hide the disabled **Select All** option that appears in multivalue controls.

Once you create the control, users can use the same control to select all or multiple values of a filtered column in a dataset.

# Using row-level security in Amazon Quick


|  | 
| --- |
|  Applies to:  Enterprise Edition  | 

In the Enterprise edition of Amazon Quick, you can restrict access to a dataset by configuring row-level security (RLS) on it. You can do this before or after you share the dataset. Dataset owners that you share an RLS-protected dataset with can still see all the data. Readers, however, can see only the data permitted by the rules in the permissions dataset.

Also, when you embed Amazon Quick dashboards in your application for unregistered users of Quick, you can use row-level security (RLS) to restrict data with tags. A tag is a user-specified string that identifies a session in your application. You can use tags to implement RLS controls for your datasets. When you configure RLS-based restrictions on a dataset, Quick filters the data based on the session tags tied to the user's session.

You can restrict access to a dataset using username or group-based rules, tag-based rules, or both.

Choose user-based rules if you want to secure data for users or groups that are provisioned (registered) in Quick. To do so, select a permissions dataset that contains a rule for each user or group that accesses the data. Only users or groups identified in the rules have access to the data.

Choose tag-based rules only if you are using embedded dashboards and want to secure data for users who are not provisioned (unregistered) in Quick. To do so, define tags on columns to secure the data. You pass the tag values when you embed the dashboards.

**Topics**
+ [

# Using row-level security with user-based rules to restrict access to a dataset
](restrict-access-to-a-data-set-using-row-level-security.md)
+ [

# Using row-level security with tag-based rules to restrict access to a dataset when embedding dashboards for anonymous users
](quicksight-dev-rls-tags.md)

# Using row-level security with user-based rules to restrict access to a dataset


|  | 
| --- |
|  Applies to:  Enterprise Edition  | 

In the Enterprise edition of Amazon Quick, you can restrict access to a dataset by configuring row-level security (RLS) on it. You can do this before or after you share the dataset. Dataset owners that you share an RLS-protected dataset with can still see all the data. Readers, however, can see only the data permitted by the rules in the permissions dataset. By adding row-level security, you can further control their access.

**Note**  
When applying row-level security to SPICE datasets, each field in the dataset can contain up to 2,047 Unicode characters. Fields that contain more than this quota are truncated during ingestion. To learn more about SPICE data quotas, see [SPICE quotas for imported data](data-source-limits.md#spice-limits).

To do this, you create a query or file with one column for user or group identification. You can use either `UserName` and `GroupName`, or alternatively `UserARN` and `GroupARN`. You can think of this as *adding a rule* for that user or group. Then you can add one column to the query or file for each field that you want to grant or restrict access to. For each user or group name that you add, you add the values for each field. You can use NULL (no value) to mean all values. To see examples of dataset rules, see [Creating dataset rules for row-level security](#create-data-set-rules-for-row-level-security).

To apply the dataset rules, you add the rules as a permissions dataset to your dataset. Keep in mind the following points:
+ The permissions dataset can't contain duplicate values. Duplicates are ignored when evaluating how to apply the rules.
+ Each user or group specified can see only the rows that *match* the field values in the dataset rules. 
+ If you add a rule for a user or group and leave all other columns with no value (NULL), you grant them access to all the data. 
+ If you don't add a rule for a user or group, that user or group can't see any of the data. 
+ The full set of rule records that are applied per user must not exceed 999. This limitation applies to the total number of rules that are directly assigned to a username, plus any rules that are assigned to the user through group names. 
+ If a field includes a comma (,), Amazon Quick treats each comma-separated word as an individual value in the filter. For example, `AWS,INC` is treated as two strings, `AWS` and `INC`, as in the filter `('AWS', 'INC')`. To filter on the literal value `AWS,INC`, wrap the string in double quotation marks in the permissions dataset. 

  If the restricted dataset is a SPICE dataset, the number of filter values applied per user can't exceed 192,000 for each restricted field. This applies to the total number of filter values that are directly assigned to a username, plus any filter values that are assigned to the user through group names.

  If the restricted dataset is a direct query dataset, the number of filter values applied per user varies by data source.

  Exceeding the filter value limit can cause visual rendering to fail. We recommend adding a column to your restricted dataset that divides the rows into groups based on the original restricted column, so that the filter list can be shortened.
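As an example of the comma rule above, a minimal permissions file (the column names here are illustrative) wraps the literal value `AWS,INC` in double quotation marks so that it isn't split into two filter values:

```
UserName,Customer
MarthaRivera,"AWS,INC"
```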

Amazon Quick treats spaces as literal values. If you have a space in a field that you are restricting, the dataset rule applies to those rows. Amazon Quick treats both NULLs and blanks (empty strings "") as "no value". A NULL is an empty field value. 

Depending on what data source your dataset is coming from, you can configure a direct query to access a table of permissions. Terms with spaces inside them don't need to be delimited with quotes. If you use a direct query, you can easily change the query in the original data source. 

Or you can upload dataset rules from a text file or spreadsheet. If you are using a comma-separated value (CSV) file, don't include any spaces on the given line. Terms with spaces inside them need to be delimited with quotation marks. If you use dataset rules that are file-based, apply any changes by overwriting the existing rules in the dataset's permissions settings.

Datasets that are restricted are marked with the word **RESTRICTED** in the **Data** screen.

Child datasets that are created from a parent dataset that has RLS rules active retain the same RLS rules that the parent dataset has. You can add more RLS rules to the child dataset, but you can't remove the RLS rules that the dataset inherits from the parent dataset. 

Child datasets that are created from a parent dataset that has RLS rules active can only be created with Direct Query. Child datasets that inherit the parent dataset's RLS rules aren't supported in SPICE.

Row-level security works only for fields containing textual data (string, char, varchar, and so on). It doesn't currently work for dates or numeric fields. Anomaly detection is not supported for datasets that use row-level security (RLS).

## Creating dataset rules for row-level security


Use the following procedure to create a permissions file or query to use as dataset rules.

**To create a permissions file or query to use as dataset rules**

1. Create a file or a query that contains the dataset rules (permissions) for row-level security. 

   It doesn't matter what order the fields are in. However, all the fields are case-sensitive. Make sure that they exactly match the field names and values. 

   The structure should look similar to one of the following. Make sure that you have at least one field that identifies either users or groups. You can include both, but only one is required, and only one is used at a time. The field that you use for users or groups can have any name you choose.
**Note**  
If you are specifying groups, use only Amazon Quick groups or Microsoft AD groups. 

   The following example shows a table with groups.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/quick/latest/userguide/restrict-access-to-a-data-set-using-row-level-security.html)

   The following example shows a table with usernames.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/quick/latest/userguide/restrict-access-to-a-data-set-using-row-level-security.html)

   The following example shows a table with user and group Amazon Resource Names (ARNs).    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/quick/latest/userguide/restrict-access-to-a-data-set-using-row-level-security.html)

   Or if you use a .csv file, the structure should look similar to one of the following.

   ```
   UserName,SalesRegion,Segment
   AlejandroRosalez,EMEA,"Enterprise,SMB,Startup"
   MarthaRivera,US,Enterprise
   NikhilJayashankars,US,SMB
   PauloSantos,US,Startup
   SaanviSarkar,APAC,"SMB,Startup"
   sales-tps@example.com,"",""
   ZhangWei,APAC-Sales,"Enterprise,Startup"
   ```

   ```
   GroupName,SalesRegion,Segment
   EMEA-Sales,EMEA,"Enterprise,SMB,Startup"
   US-Sales,US,Enterprise
   US-Sales,US,SMB
   US-Sales,US,Startup
   APAC-Sales,APAC,"SMB,Startup"
   Corporate-Reporting,"",""
   APAC-Sales,APAC,"Enterprise,Startup"
   ```

   ```
   UserARN,GroupARN,SalesRegion
   arn:aws:quicksight:us-east-1:123456789012:user/Bob,arn:aws:quicksight:us-east-1:123456789012:group/group-1,APAC
   arn:aws:quicksight:us-east-1:123456789012:user/Sam,arn:aws:quicksight:us-east-1:123456789012:group/group-2,US
   ```

   Following is a SQL example.

   ```
/* for users */
select User as UserName, SalesRegion, Segment
from tps-permissions;

/* for groups */
select Group as GroupName, SalesRegion, Segment
from tps-permissions;
   ```

1. Create a dataset for the dataset rules. To make sure that you can easily find it, give it a meaningful name, for example **Permissions-Sales-Pipeline**.

## Rules Dataset flagging for row-level security


Use the following procedure to appropriately flag a dataset as a rules dataset.

Rules Dataset is a flag that distinguishes permission datasets used for row-level security from regular datasets. If a permissions dataset was applied to a regular dataset before March 31, 2025, it will have a Rules Dataset flag in the **Dataset** landing page. 

If a permissions dataset was not applied to a regular dataset by March 31, 2025, it will be categorized as a regular dataset. To use it as a rules dataset, duplicate the permissions dataset and flag it as a rules dataset on the console when creating the dataset. Select EDIT DATASET and under the options, choose DUPLICATE AS RULES DATASET. 

To successfully duplicate it as a rules dataset, ensure that the original dataset has the required user metadata or group metadata columns and contains only string type columns.

To create a new rules dataset on the console, select NEW RULES DATASET under the NEW DATASET dropdown. When creating a rules dataset programmatically, add the following parameter: [UseAs: RLS\_RULES](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_CreateDataSet.html#API_CreateDataSet_RequestSyntax). This optional parameter is used only to create a rules dataset. After a dataset has been created, either through the console or programmatically, and flagged as either a rules dataset or a regular dataset, the flag cannot be changed.
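For example, with the AWS CLI the same parameter is passed as `--use-as` (a sketch with most parameters elided; verify the parameter name against the current CLI reference for your version):

```
create-data-set
		--aws-account-id <value>
		--data-set-id <value>
		--name <value>
		--physical-table-map <value>
		--import-mode <value>
		--use-as RLS_RULES
```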

Once datasets are flagged as rules datasets, Amazon Quick applies strict SPICE ingestion rules to them. To ensure data integrity, SPICE ingestions for rules datasets fail if there are invalid rows or cells that exceed length limits. You must fix the ingestion issues to re-initiate a successful ingestion. Strict ingestion rules apply only to rules datasets. Regular datasets don't have ingestion failures when there are skipped rows or string truncations. 

## Applying row-level security


Use the following procedure to apply row-level security (RLS) by using a file or query as a dataset that contains the rules for permissions. 

**To apply row-level security by using a file or query**

1. Confirm that you have added your rules as a new dataset. If you added them, but don't see them under the list of datasets, refresh the screen.

1. On the **Data** page, choose the dataset.

1. On the dataset details page that opens, for **Row-level security**, choose **Set up**.

1. On the **Set up row-level security** page that opens, choose **User-based rules**.

1. From the list of datasets that appears, choose your permissions dataset. 

   If your permissions dataset doesn't appear on this screen, return to your datasets, and refresh the page.

1. For **Permissions policy**, choose **Grant access to dataset**. Each dataset has only one active permissions dataset. If you try to add a second permissions dataset, it overwrites the existing one.
**Important**  
Some restrictions apply to NULL and empty string values when working with row-level security:  
If your dataset has NULL values or empty strings ("") in the restricted fields, these rows are ignored when the restrictions are applied. 
Inside the permissions dataset, NULL values and empty strings are treated the same. For more information, see the following table.
To prevent accidentally exposing sensitive information, Amazon Quick skips empty RLS rules that grant access to everyone. An *empty RLS rule* occurs when all columns of a row have no value. Quick RLS treats NULL, empty strings (""), and empty comma-separated strings (for example ",,,") as no value.  
After skipping empty rules, other nonempty RLS rules still apply.
If a permission dataset has only empty rules and all of them were skipped, no one will have access to any data restricted by this permission dataset.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/quick/latest/userguide/restrict-access-to-a-data-set-using-row-level-security.html)

   Anyone whom you shared your dashboard with can see all the data in it, unless the dataset is restricted by dataset rules. 

1. Choose **Apply dataset** to save your changes. Then, on the **Save data set rules?** page, choose **Apply and activate**. Changes in permissions apply immediately to existing users. 

1. (Optional) To remove permissions, first remove the dataset rules from the dataset. 

   Make certain that the dataset rules are removed. Then, choose the permissions dataset and choose **Remove data set**.

   To overwrite permissions, choose a new permissions dataset and apply it. You can reuse the same dataset name. However, make sure to apply the new permissions in the **Permissions** screen to make these permissions active. SQL queries dynamically update, so these can be managed outside of Amazon Quick. For queries, the permissions are updated when the direct query cache is automatically refreshed.

If you delete a file-based permissions dataset before you remove it from the target dataset, restricted users can't access the dataset. While the dataset is in this state, it remains marked as **RESTRICTED**. However, when you view **Permissions** for that dataset, you can see that it has no selected dataset rules. 

To fix this, specify new dataset rules. Creating a dataset with the same name is not enough to fix this. You must choose the new permissions dataset on the **Permissions** screen. This restriction doesn't apply to direct SQL queries.

# Using row-level security with tag-based rules to restrict access to a dataset when embedding dashboards for anonymous users


|  | 
| --- |
|  Applies to:  Enterprise Edition  | 


|  | 
| --- |
|    Intended audience:  Amazon Quick Administrators and Amazon Quick developers  | 

When you embed Amazon Quick dashboards in your application for users who are not provisioned (registered) in Quick, you can use row-level security (RLS) to restrict data with tags. A tag is a user-specified string that identifies a session in your application. You can use tags to implement RLS controls for your datasets. When you configure RLS-based restrictions on a dataset, Quick filters the data based on the session tags tied to the user's session.

For example, let's say you're a logistics company that has a customer-facing application for various retailers. Thousands of users from these retailers access your application to see metrics related to how their orders are getting shipped from your warehouse. 

You don't want to manage thousands of users in Quick, so you use anonymous embedding to embed the selected dashboards in your application that your authenticated and authorized users can see. However, you want to make sure retailers see only data that is for their business and not for others. You can use RLS with tags to make sure your customers only see data that's relevant to them.

To do so, complete the following steps:

1. Add RLS tags to a dataset.

1. Assign values to those tags at runtime using the `GenerateEmbedUrlForAnonymousUser` API operation.

   For more information about embedding dashboards for anonymous users using the `GenerateEmbedUrlForAnonymousUser` API operation, see [Embedding Amazon Quick Sight dashboards for anonymous (unregistered) users](embedded-analytics-dashboards-for-everyone.md).

Before you can use RLS with tags, keep in mind the following points:
+ Using RLS with tags is currently only supported for anonymous embedding, specifically for embedded dashboards that use the `GenerateEmbedUrlForAnonymousUser` API operation.
+ Using RLS with tags isn't supported for embedded dashboards that use the `GenerateEmbedURLForRegisteredUser` API operation or the old `GetDashboardEmbedUrl` API operation.
+ RLS tags aren't supported with AWS Identity and Access Management (IAM) or the Quick identity type.
+ When applying row-level security to SPICE datasets, each field in the dataset can contain up to 2,047 Unicode characters. Fields that contain more than this quota are truncated during ingestion. To learn more about SPICE data quotas, see [SPICE quotas for imported data](data-source-limits.md#spice-limits).

## Step 1: Add RLS tags to a dataset


You can add tag-based rules to a dataset in Amazon Quick. Alternatively, you can call the `CreateDataSet` or `UpdateDataSet` API operation and add tag-based rules that way. For more information, see [Add RLS tags to a dataset using the API](#quicksight-dev-rls-tags-add-api).

Use the following procedure to add RLS tags to a dataset in Quick.

**To add RLS tags to a dataset**

1. From the Quick start page, choose **Data** at left.

1. Choose the dataset that you want to add RLS to.

1. On the dataset details page that opens, for **Row-level security**, choose **Set up**.

1. On the **Set up row-level security** page that opens, choose **Tag-based rules**.

1. For **Column**, choose a column that you want to add tag rules to.

   For example, in the case for the logistics company, the `retailer_id` column is used.

   Only columns with a string data type are listed.

1. For **Tag**, enter a tag key. You can enter any tag name that you want.

   For example, in the case for the logistics company, the tag key `tag_retailer_id` is used. Doing this sets row-level security based on the retailer that's accessing the application.

1. (Optional) For **Delimiter**, choose a delimiter from the list, or enter your own.

   You can use delimiters to separate text strings when assigning more than one value to a tag. The value for a delimiter can be 10 characters long, at most.

1. (Optional) For **Match all**, choose the asterisk (**\***), or enter your own character or characters.

   This option can be any character that you want to use to filter by all the values in that column of the dataset. Instead of listing the values one by one, you can use this character. If specified, this value can be from 1 to 256 characters long.

1. Choose **Add**.

   The tag rule is added to the dataset and is listed at the bottom, but it isn't applied yet. To add another tag rule to the dataset, repeat steps 5–9. To edit a tag rule, choose the pencil icon that follows the rule. To delete a tag rule, choose the delete icon that follows the rule. You can add up to 50 tags to a dataset.

1. When you're ready to apply the tag rules to the dataset, choose **Apply rules**.

1. On the **Turn on tag-based security?** page that opens, choose **Apply and activate**.

   The tag-based rules are now active. On the **Set up row-level security** page, a toggle appears for you to turn tag rules on and off for the dataset.

   To turn off all tag-based rules for the dataset, switch the **Tag-Based rules** toggle off, and then enter "confirm" in the text box that appears.

   On the **Data** page, a lock icon appears in the dataset row to indicate that tag rules are enabled.

   You can now use tag rules to set tag values at runtime, described in [Step 2: Assign values to RLS tags at runtime](#quicksight-dev-rls-tags-assign-values). The rules only affect Quick readers when active.
**Important**  
After tags are assigned and enabled on the dataset, make sure to give Quick authors permissions to see any of the data in the dataset when authoring a dashboard.   
To give Quick authors permission to see data in the dataset, create a permissions file or query to use as dataset rules. For more information, see [Creating dataset rules for row-level security](restrict-access-to-a-data-set-using-row-level-security.md#create-data-set-rules-for-row-level-security).

After you create a tag-based rule, a new **Manage rules** table appears that shows how your tag-based rules relate to each other. To make changes to the rules listed in the **Manage rules** table, choose the pencil icon that follows the rule. Then add or remove tags, and choose **Update**. To apply your updated rule to the dataset, choose **Apply**.

### (Optional) Add the OR condition to RLS tags


You can also add the OR condition to your tag-based rules to further customize the way data is presented to your Quick account users. When you use the OR condition with your tag-based rules, visuals in Quick appear if at least one tag defined in the rule is valid.

**To add the OR condition to your tag-based rules**

1. In the **Manage rules** table, choose **Add OR condition**.

1. In the **Select tag** dropdown list that appears, choose the tag that you want to create an OR condition for. You can add up to 50 OR conditions to the **Manage rules** table. You can add multiple tags to a single column in a dataset, but at least one column tag needs to be included in a rule.

1. Choose **Update** to add the condition to your rule, then choose **Apply** to apply the updated rule to your dataset.

### Add RLS tags to a dataset using the API


Alternatively, you can configure and enable tag-based row-level security on your dataset by calling the `CreateDataSet` or `UpdateDataSet` API operation. Use the following examples to learn how.

**Important**  
When configuring session tags in the API call,  
Treat session tags as security credentials. Do not expose session tags to end users or client-side code.
Implement server-side controls. Ensure that session tags are set exclusively by your trusted backend services, not by parameters that end users can modify.
Protect session tags from enumeration. Ensure that users in one tenant cannot discover or guess sessionTag values belonging to other tenants.
Review your architecture. If downstream customers or partners are allowed to call the API directly, evaluate whether those parties could specify sessionTag values for tenants they should not access.

------
#### [ CreateDataSet ]

The following is an example for creating a dataset that uses RLS with tags. It assumes the scenario of the logistics company described previously. The tags are defined in the `row-level-permission-tag-configuration` element. The tags are defined on the columns that you want to secure the data for. For more information about this optional element, see [RowLevelPermissionTagConfiguration](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_RowLevelPermissionTagConfiguration.html) in the *Amazon Quick API Reference*.

```
create-data-set
		--aws-account-id <value>
		--data-set-id <value>
		--name <value>
		--physical-table-map <value>
		[--logical-table-map <value>]
		--import-mode <value>
		[--column-groups <value>]
		[--field-folders <value>]
		[--permissions <value>]
		[--row-level-permission-data-set <value>]
		[--column-level-permission-rules <value>]
		[--tags <value>]
		[--cli-input-json <value>]
		[--generate-cli-skeleton <value>]
		[--row-level-permission-tag-configuration 
	'{
		"Status": "ENABLED",
		"TagRules": 
			[
				{
					"TagKey": "tag_retailer_id",
					"ColumnName": "retailer_id",
					"TagMultiValueDelimiter": ",",
					"MatchAllValue": "*"
				},
				{
					"TagKey": "tag_role",
					"ColumnName": "role"
				}
			],
		"TagRuleConfigurations":
			[
				["tag_retailer_id"],
				["tag_role"]
			]
	}'
]
```

The tags in this example are defined in the `TagRules` part of the element. In this example, two tags are defined based on two columns:
+ The `tag_retailer_id` tag key is defined for the `retailer_id` column. In this case for the logistics company, this sets row-level security based on the retailer that's accessing the application.
+ The `tag_role` tag key is defined for the `role` column. In this case for the logistics company, this sets an additional layer of row-level security based on the role of the user accessing your application from a specific retailer. An example is `store_supervisor` or `manager`.

For each tag, you can define `TagMultiValueDelimiter` and `MatchAllValue`. These are optional.
+ `TagMultiValueDelimiter` – This option can be any string that you want to use to delimit the values when you pass them at runtime. The value can be 10 characters long, at most. In this case, a comma is used as the delimiter value.
+ `MatchAllValue` – This option can be any character that you want to use to filter by all the values in that column of the dataset. Instead of listing the values one by one, you can use this character. If specified, this value can be from 1 to 256 characters long. In this case, an asterisk is used as the match-all value.
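Conceptually, the delimiter and match-all options behave like the following sketch. This is only an illustration of the filtering semantics; the function name is hypothetical, and this is not Quick's actual implementation:

```python
# Hypothetical sketch: how a tag value is expanded into a row filter.
# TagMultiValueDelimiter splits one tag string into several allowed values;
# MatchAllValue short-circuits the filter to allow every row.

def rows_allowed_by_tag(rows, column, tag_value, delimiter=",", match_all="*"):
    """Return the rows whose `column` value is permitted by `tag_value`."""
    if tag_value == match_all:
        return rows  # match-all value: no restriction on this column
    allowed = set(tag_value.split(delimiter))
    return [row for row in rows if row[column] in allowed]

rows = [
    {"retailer_id": "West"},
    {"retailer_id": "Central"},
    {"retailer_id": "North"},
]

# "West,Central" expands to two allowed values via the comma delimiter.
print(rows_allowed_by_tag(rows, "retailer_id", "West,Central"))
# [{'retailer_id': 'West'}, {'retailer_id': 'Central'}]

# "*" matches every row, without listing the values one by one.
print(rows_allowed_by_tag(rows, "retailer_id", "*"))
# [{'retailer_id': 'West'}, {'retailer_id': 'Central'}, {'retailer_id': 'North'}]
```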

When configuring tags for dataset columns, you turn them on or off by using the mandatory `Status` property. To enable the tag rules, set this property to `ENABLED`. By turning on tag rules, you can use them to set tag values at runtime, described in [Step 2: Assign values to RLS tags at runtime](#quicksight-dev-rls-tags-assign-values).

The following is an example of the response definition.

```
{
			"Status": 201,
			"Arn": "arn:aws:quicksight:us-west-2:11112222333:dataset/RLS-Dataset",
			"DataSetId": "RLS-Dataset",
			"RequestId": "aa4f3c00-b937-4175-859a-543f250f8bb2"
		}
```

------
#### [ UpdateDataSet ]


You can use the `UpdateDataSet` API operation to add or update RLS tags for an existing dataset.

The following is an example of updating a dataset with RLS tags. It assumes the scenario of the logistics company described previously.

```
update-data-set
		--aws-account-id <value>
		--data-set-id <value>
		--name <value>
		--physical-table-map <value>
		[--logical-table-map <value>]
		--import-mode <value>
		[--column-groups <value>
		[--field-folders <value>]
		[--row-level-permission-data-set <value>]
		[--column-level-permission-rules <value>]
		[--cli-input-json <value>]
		[--generate-cli-skeleton <value>]
				[--row-level-permission-tag-configuration 
	'{
		"Status": "ENABLED",
		"TagRules": 
			[
				{
					"TagKey": "tag_retailer_id",
					"ColumnName": "retailer_id",
					"TagMultiValueDelimiter": ",",
					"MatchAllValue": "*"
				},
				{
					"TagKey": "tag_role",
					"ColumnName": "role"
				}
			],
		"TagRuleConfigurations":
			[
				["tag_retailer_id"],
				["tag_role"]
			]
	}'
]
```

The following is an example of the response definition.

```
{
			"Status": 201,
			"Arn": "arn:aws:quicksight:us-west-2:11112222333:dataset/RLS-Dataset",
			"DataSetId": "RLS-Dataset",
			"RequestId": "aa4f3c00-b937-4175-859a-543f250f8bb2"
		}
```

------

**Important**  
After tags are assigned and enabled on the dataset, make sure to give Quick authors permissions to see any of the data in the dataset when authoring a dashboard.   
To give Quick authors permission to see data in the dataset, create a permissions file or query to use as dataset rules. For more information, see [Creating dataset rules for row-level security](restrict-access-to-a-data-set-using-row-level-security.md#create-data-set-rules-for-row-level-security).

For more information about the `RowLevelPermissionTagConfiguration` element, see [RowLevelPermissionTagConfiguration](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_RowLevelPermissionTagConfiguration.html) in the *Amazon Quick API Reference*.

## Step 2: Assign values to RLS tags at runtime


You can use tags for RLS only for anonymous embedding. You can set values for tags using the `GenerateEmbedUrlForAnonymousUser` API operation.

**Important**  
When configuring session tags in the API call, follow these guidelines:  
+ Treat session tags as security credentials. Don't expose session tags to end users or client-side code.
+ Implement server-side controls. Ensure that session tags are set exclusively by your trusted backend services, not by parameters that end users can modify.
+ Protect session tags from enumeration. Ensure that users in one tenant can't discover or guess session tag values belonging to other tenants.
+ Review your architecture. If downstream customers or partners are allowed to call the API directly, evaluate whether those parties could specify session tag values for tenants that they shouldn't access.

The following example shows how to assign values to RLS tags that were defined in the dataset in the previous step.

```
POST /accounts/AwsAccountId/embed-url/anonymous-user
	HTTP/1.1
	Content-type: application/json
	{
		"AwsAccountId": "string",
		"SessionLifetimeInMinutes": integer,
		"Namespace": "string", // The namespace to which the anonymous end user virtually belongs
		"SessionTags":  // Optional: Can be used for row-level security
			[
				{
					"Key": "tag_retailer_id",
					"Value": "West,Central,South"
				},
				{
					"Key": "tag_role",
					"Value": "shift_manager"
				}
			],
		"AuthorizedResourceArns":
			[
				"string"
			],
		"ExperienceConfiguration":
			{
				"Dashboard":
					{
						"InitialDashboardId": "string"
						// This is the initial dashboard ID the customer wants the user to land on. This ID goes in the output URL.
					}
			}
	}
```

The following is an example of the response definition.

```
HTTP/1.1 Status
	Content-type: application/json

	{
	"EmbedUrl": "string",
	"RequestId": "string"
	}
```

RLS without registering users in Quick is supported only in the `GenerateEmbedUrlForAnonymousUser` API operation. In this operation, under `SessionTags`, you can define the values for the tags associated with the dataset columns.

In this case, the following assignments are defined:
+ Values `West`, `Central`, and `South` are assigned to the `tag_retailer_id` tag at runtime. A comma is used for the delimiter, which was defined in `TagMultiValueDelimiter` in the dataset. To use all values in the column, you can set the value to `*`, which was defined as the `MatchAllValue` when creating the tag.
+ The value `shift_manager` is assigned to the `tag_role` tag.

The user using the generated URL can view only the rows that have the `shift_manager` value in the `role` column and a value of `West`, `Central`, or `South` in the `retailer_id` column.
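From application code, the request above can be assembled with the AWS SDK. The following is a minimal sketch using boto3's `generate_embed_url_for_anonymous_user`; the namespace, session lifetime, and the account ID, ARN, and dashboard ID arguments are placeholders that you would replace with your own values:

```python
def build_session_tags(tag_values):
    """Convert a {tag_key: value} mapping into the SessionTags list shape."""
    return [{"Key": key, "Value": value} for key, value in tag_values.items()]

def get_embed_url(account_id, dashboard_arn, dashboard_id, tag_values):
    """Request an anonymous embed URL with RLS session tags applied."""
    import boto3  # imported here so build_session_tags stays dependency-free

    client = boto3.client("quicksight")
    response = client.generate_embed_url_for_anonymous_user(
        AwsAccountId=account_id,
        Namespace="default",          # placeholder namespace
        SessionLifetimeInMinutes=60,  # placeholder lifetime
        SessionTags=build_session_tags(tag_values),
        AuthorizedResourceArns=[dashboard_arn],
        ExperienceConfiguration={"Dashboard": {"InitialDashboardId": dashboard_id}},
    )
    return response["EmbedUrl"]

# The logistics example: this session sees only shift_manager rows for
# the West, Central, and South retailers.
tags = {"tag_retailer_id": "West,Central,South", "tag_role": "shift_manager"}
```

Your backend would then embed the returned URL in your application; `build_session_tags` simply mirrors the `SessionTags` JSON shape shown in the request above.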

For more information about embedding dashboards for anonymous users using the `GenerateEmbedUrlForAnonymousUser` API operation, see [Embedding Amazon Quick Sight dashboards for anonymous (unregistered) users](embedded-analytics-dashboards-for-everyone.md), or [GenerateEmbedUrlForAnonymousUser](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_GenerateEmbedUrlForAnonymousUser.html) in the *Amazon Quick API Reference*.

# Using column-level security to restrict access to a dataset

In the Enterprise edition of Quick, you can restrict access to a dataset by configuring column-level security (CLS) on it. A dataset or analysis with CLS enabled has the restricted ![\[The lock icon for CLS.\]](http://docs.aws.amazon.com/quick/latest/userguide/images/cls-restricted-icon.png) symbol next to it. By default, all users and groups have access to the data. By using CLS, you can manage access to specific columns in your dataset.

If you use an analysis or dashboard that contains datasets with CLS restrictions that you don't have access to, you can't create, view, or edit visuals that use the restricted fields. For most visual types, if a visual has restricted columns that you don't have access to, you can't see the visual in your analysis or dashboard.

Tables and pivot tables behave differently. If a table or pivot table uses restricted columns in the **Rows** or **Columns** field wells, and you don't have access to these restricted columns, you can't see the visual in an analysis or dashboard. If a table or pivot table has restricted columns in the **Values** field well, you can see the table in an analysis or dashboard with only the values that you have access to. The values for restricted columns show as Not Authorized.

To enable column-level security on an analysis or dashboard, you need administrator access.
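Column-level security can also be configured programmatically. The following fragment is a hedged sketch of the `ColumnLevelPermissionRules` property accepted by the `CreateDataSet` and `UpdateDataSet` API operations; the principal ARN and column names here are placeholders, not values from this guide:

```
"ColumnLevelPermissionRules": [
    {
        "Principals": [
            "arn:aws:quicksight:us-east-1:111122223333:user/default/analyst1"
        ],
        "ColumnNames": [
            "salary",
            "ssn"
        ]
    }
]
```

Each rule grants the listed principals access to the listed columns; columns covered by a rule are restricted for everyone else.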

**To create a new analysis with CLS**

1. On the Quick start page, choose the **Analyses** tab.

1. At upper right, choose **New analysis**.

1. Choose a dataset, and choose **Column-level security**.

1. Select the columns that you want to restrict, and then choose **Next**. By default, all groups and users have access to all columns.

1. Choose who can access each column, and then choose **Apply** to save your changes.

**To enable CLS on an existing dataset**

1. On the Quick start page, choose the **Data** tab.

1. On the Data page, open your dataset.

1. On the dataset details page that opens, for **Column-level security**, choose **Set up**.

1. Select the columns that you want to restrict, and then choose **Next**. By default, all groups and users have access to all columns.

1. Choose who can access each column, and then choose **Apply** to save your changes.

**To create a dashboard with CLS**

1. On the Quick navigation pane, choose the **Analyses** tab.

1. Choose the analysis that you want to create a dashboard of.

1. At upper right, choose **Publish**.

1. Choose one of the following:
   + To create a new dashboard, choose **Publish new dashboard as** and enter a name for the new dashboard.
   + To replace an existing dashboard, choose **Replace an existing dashboard** and choose the dashboard from the list.

   Additionally, you can choose **Advanced publish options**. For more information, see [Publishing dashboards](creating-a-dashboard.md).

1. Choose **Publish dashboard**.

1. (Optional) Do one of the following:
   + To publish a dashboard without sharing, choose **x** at the upper right of the **Share dashboard with users** screen when it appears. You can share the dashboard later by choosing **Share** from the application bar.
   + To share the dashboard, follow the procedure in [Sharing Amazon Quick Sight dashboards](sharing-a-dashboard.md).

# Running queries as an IAM role in Amazon Quick

You can enhance data security by using fine-grained access policies rather than broader permissions for data sources connected to Amazon Athena, Amazon Redshift, or Amazon S3. You start by creating an AWS Identity and Access Management (IAM) role with permissions that are activated when a person or an API starts a query. Then, a Quick administrator or a developer assigns the IAM role to an Athena or Amazon S3 data source. With the role in place, any person or API that runs the query has exactly the permissions necessary to run it.

Here are some things to consider before you commit to implementing run-as roles to enhance data security: 
+ Articulate how the additional security works to your advantage.
+ Work with your Quick administrator to learn if adding roles to data sources helps you to better meet your security goals or requirements. 
+ Ask whether this type of security, for the number of data sources, people, and applications involved, can feasibly be documented and maintained by your team. If not, who will undertake that part of the work?
+ In a structured organization, locate stakeholders in parallel teams in Operations, Development, and IT Support. Ask for their experience, advice, and willingness to support your plan.
+ Before you launch your project, consider doing a proof of concept that involves the people who need access to the data.

The following rules apply to using run-as roles with Athena, Amazon Redshift, and Amazon S3:
+ Each data source can have only one associated RoleArn. Consumers of the data source, who typically access datasets and visuals, can generate many different types of queries. The role places boundaries on which queries work and which don't work.
+ The ARN must correspond to an IAM role in the same AWS account as the Quick instance that uses it.
+ The IAM role must have a trust relationship allowing Quick to assume the role.
+ The identity that calls Quick's APIs must have permission to pass the role before they can update the `RoleArn` property. You only need to pass the role when creating or updating the role ARN. The permissions aren't re-evaluated later on. Similarly, the permission isn't required when the role ARN is omitted.
+ When the role ARN is omitted, the Athena or Amazon S3 data source uses the account-wide role and scope-down policies.
+ When the role ARN is present, the account-wide role and any scope-down policies are both ignored. For Athena data sources, Lake Formation permissions are not ignored.
+ For Amazon S3 data sources, both the manifest file and the data specified by the manifest file must be accessible using the IAM role.
+ The ARN string needs to match an existing IAM role in the AWS account and AWS Region where the data is located and queried. 
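The pass-role requirement in the preceding list means that the identity calling the API needs an `iam:PassRole` statement like the following. This is a minimal sketch; the account ID and role name are placeholders that you would replace with your own:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::222222222222:role/TestAthenaRoleForQuickSight"
        }
    ]
}
```

Scoping the `Resource` to the specific role, rather than `*`, keeps the caller from passing unintended roles to Quick.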

When Quick connects to another AWS service, it uses an IAM role. By default, Quick creates this less granular version of the role for each service it uses, and AWS account administrators manage the role. When you add an IAM role ARN with a custom permissions policy, you override the broader role for the data sources that need extra protection. For more information about policies, see [Create a customer managed policy](https://docs.aws.amazon.com/IAM/latest/UserGuide/tutorial_managed-policies.html) in the IAM User Guide.

## Run queries with Athena data sources

Use the API to attach the ARN to the Athena data source. To do so, add the role ARN in the [RoleArn](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_RoleArn.html) property of [AthenaParameters](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_AthenaParameters.html). For verification, you can see the role ARN on the **Edit Athena data source** dialog box. However, **Role ARN** is a read-only field.

To get started, you need a custom IAM role, which we demonstrate in the following example.

Keep in mind that the following code example is for learning purposes only. Use it in a temporary development and testing environment, never in production. The policy in this example doesn't scope access down to specific resources, which a deployable policy must do. Also, even for development, you need to substitute your own AWS account information.

The following commands create a simple new role and attach a few policies that grant permissions to Quick.

```
aws iam create-role \
        --role-name TestAthenaRoleForQuickSight \
        --description "Test Athena Role For QuickSight" \
        --assume-role-policy-document '{
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "quicksight.amazonaws.com"
                    },
                    "Action": "sts:AssumeRole"
                }
            ]
        }'
```

After you've identified or created an IAM role to use with each data source, attach the policies by using the `attach-role-policy` command.

```
aws iam attach-role-policy \
        --role-name TestAthenaRoleForQuickSight \
        --policy-arn arn:aws:iam::222222222222:policy/service-role/AWSQuickSightS3Policy1

aws iam attach-role-policy \
        --role-name TestAthenaRoleForQuickSight \
        --policy-arn arn:aws:iam::aws:policy/service-role/AWSQuicksightAthenaAccess1

aws iam attach-role-policy \
        --role-name TestAthenaRoleForQuickSight \
        --policy-arn arn:aws:iam::aws:policy/AmazonS3Access1

After you verify your permissions, you can use the role in Quick data sources by creating a new role or updating an existing role. When using these commands, update the AWS account ID and AWS Region to match your own. 

Remember, these example code snippets are not for production environments. AWS strongly recommends that you identify and use a set of least privilege policies for your production cases.

```
aws quicksight create-data-source \
        --aws-account-id 222222222222 \
        --region us-east-1 \
        --data-source-id "athena-with-custom-role" \
        --cli-input-json '{
            "Name": "Athena with a custom Role",
            "Type": "ATHENA",
            "DataSourceParameters": {
                "AthenaParameters": {
                    "RoleArn": "arn:aws:iam::222222222222:role/TestAthenaRoleForQuickSight"
                }
            }
        }'
```

## Run queries with Amazon Redshift data sources

Connect your Amazon Redshift data with the run-as role to enhance your data security with fine-grained access policies. You can create a run-as role for Amazon Redshift data sources that use a public network or a VPC connection. You specify the connection type that you want to use in the **Edit Amazon Redshift data source** dialog box. The run-as role is not supported for Amazon Redshift Serverless data sources.

To get started, you need a custom IAM role, which we demonstrate in the following example. The following commands create a sample new role and attach policies that grant permissions to Quick.

```
aws iam create-role \
--role-name TestRedshiftRoleForQuickSight \
--description "Test Redshift Role For QuickSight" \
--assume-role-policy-document '{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "quicksight.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}'
```

After you identify or create an IAM role to use with each data source, attach the policies with the `attach-role-policy` command. If the `redshift:GetClusterCredentialsWithIAM` permission is attached to the role that you want to use, the values for `DatabaseUser` and `DatabaseGroups` are optional.

```
aws iam attach-role-policy \
--role-name TestRedshiftRoleForQuickSight \
--policy-arn arn:aws:iam::111122223333:policy/service-role/AWSQuickSightRedshiftPolicy
    
        
aws iam create-policy --policy-name RedshiftGetClusterCredentialsPolicy1 \
--policy-document file://redshift-get-cluster-credentials-policy.json 


aws iam attach-role-policy \
--role-name TestRedshiftRoleForQuickSight \
--policy-arn arn:aws:iam::111122223333:policy/RedshiftGetClusterCredentialsPolicy1
// redshift-get-cluster-credentials-policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RedshiftGetClusterCredentialsPolicy",
            "Effect": "Allow",
            "Action": [
                "redshift:GetClusterCredentials"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}
```

The preceding example supports data sources that use the `RoleArn`, `DatabaseUser`, and `DatabaseGroups` IAM parameters. To establish the connection through the IAM `RoleArn` parameter only, attach the `redshift:GetClusterCredentialsWithIAM` permission to your role instead, as shown in the following example.

```
aws iam attach-role-policy \
--role-name TestRedshiftRoleForQuickSight \
--policy-arn arn:aws:iam::111122223333:policy/RedshiftGetClusterCredentialsPolicy1

// redshift-get-cluster-credentials-policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RedshiftGetClusterCredentialsPolicy",
            "Effect": "Allow",
            "Action": [ "redshift:GetClusterCredentialsWithIAM" ],
            "Resource": [ "*" ]
        }
    ]
}
```

After you verify your permissions, you can use the role in Quick data sources by creating a new role or updating an existing role. When using these commands, update the AWS account ID and AWS Region to match your own.

```
aws quicksight create-data-source \
--region us-west-2 \
--endpoint https://quicksight.us-west-2.quicksight.aws.com/ \
--cli-input-json file://redshift-data-source-iam.json

// redshift-data-source-iam.json is shown below
{
    "AwsAccountId": "AWSACCOUNTID",
    "DataSourceId": "DATASOURCEID",
    "Name": "Test redshift demo iam",
    "Type": "REDSHIFT",
    "DataSourceParameters": {
        "RedshiftParameters": {
            "Database": "integ",
            "Host": "redshiftdemocluster.us-west-2.redshift.amazonaws.com",
            "Port": 8192,
            "ClusterId": "redshiftdemocluster",
            "IAMParameters": {
                "RoleArn": "arn:aws:iam::222222222222:role/TestRedshiftRoleForQuickSight",
                "DatabaseUser": "user",
                "DatabaseGroups": ["admin_group", "guest_group", "guest_group_1"]
            }
        }
    },
    "Permissions": [
      {
        "Principal": "arn:aws:quicksight:us-east-1:AWSACCOUNTID:user/default/demoname",
        "Actions": [
          "quicksight:DescribeDataSource",
          "quicksight:DescribeDataSourcePermissions",
          "quicksight:PassDataSource",
          "quicksight:UpdateDataSource",
          "quicksight:DeleteDataSource",
          "quicksight:UpdateDataSourcePermissions"
        ]
      }
    ]
}
```

If your data source uses the VPC connection type, use the following VPC configuration.

```
{
    "AwsAccountId": "AWSACCOUNTID",
    "DataSourceId": "DATASOURCEID",
    "Name": "Test redshift demo iam vpc",
    "Type": "REDSHIFT",
    "DataSourceParameters": {
        "RedshiftParameters": {
            "Database": "mydb",
            "Host": "vpcdemo.us-west-2.redshift.amazonaws.com",
            "Port": 8192,
            "ClusterId": "vpcdemo",
            "IAMParameters": {
                "RoleArn": "arn:aws:iam::222222222222:role/TestRedshiftRoleForQuickSight",
                "DatabaseUser": "user",
                "AutoCreateDatabaseUser": true
            }
        }
    },
    "VpcConnectionProperties": { 
      "VpcConnectionArn": "arn:aws:quicksight:us-west-2:222222222222:vpcConnection/VPC Name"
    },
    "Permissions": [
      {
        "Principal": "arn:aws:quicksight:us-east-1:222222222222:user/default/demoname",
        "Actions": [
          "quicksight:DescribeDataSource",
          "quicksight:DescribeDataSourcePermissions",
          "quicksight:PassDataSource",
          "quicksight:UpdateDataSource",
          "quicksight:DeleteDataSource",
          "quicksight:UpdateDataSourcePermissions"
        ]
      }
    ]
}
```

If your data source uses the `redshift:GetClusterCredentialsWithIAM` permission and doesn't use the `DatabaseUser` or `DatabaseGroups` parameters, grant the role access to some or all tables in the schema. To see whether a role has been granted `SELECT` permissions on a specific table, run the following query in the Amazon Redshift query editor.

```
SELECT
u.usename,
t.schemaname||'.'||t.tablename,
has_table_privilege(u.usename,t.tablename,'select') AS user_has_select_permission
FROM
pg_user u
CROSS JOIN
pg_tables t
WHERE
u.usename = 'IAMR:RoleName'
AND t.tablename = 'tableName'
```

For more information about the `SELECT` action in the Amazon Redshift Query Editor, see [SELECT](https://docs.aws.amazon.com/redshift/latest/dg/r_SELECT_synopsis.html).

To grant `SELECT` permissions to the role, run the following command in the Amazon Redshift query editor.

```
GRANT SELECT ON { [ TABLE ] table_name [, ...] | ALL TABLES IN SCHEMA 
schema_name [, ...] } TO "IAMR:Rolename";
```

For more information about the `GRANT` action in the Amazon Redshift Query Editor, see [GRANT](https://docs.aws.amazon.com/redshift/latest/dg/r_GRANT.html).

## Run queries with Amazon S3 data sources

Amazon S3 data sources contain a manifest file that Quick uses to find and parse your data. You can upload a JSON manifest file through the Quick console, or you can provide a URL that points to a JSON file in an S3 bucket. If you choose to provide a URL, Quick must be granted permission to access the file in Amazon S3. Use the Quick administration console to control access to the manifest file and the data that it references.

With the **RoleArn** property, you can grant access to the manifest file and the data that it references through a custom IAM role that overrides the account-wide role. Use the API to attach the ARN to the manifest file of the Amazon S3 data source. To do so, include the role ARN in the [RoleArn](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_RoleArn.html) property of [S3Parameters](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_S3Parameters.html). For verification, you can see the role ARN in the **Edit S3 data source** dialog box. However, **Role ARN** is a read-only field.

To get started, create an Amazon S3 manifest file. Then, you can either upload it to Amazon Quick when you create a new Amazon S3 dataset or place the file into the Amazon S3 bucket that contains your data files. The following example shows what a manifest file might look like:

```
{
    "fileLocations": [
        {
            "URIPrefixes": [
                "s3://quicksightUser-run-as-role/data/"
            ]
        }
    ],
    "globalUploadSettings": {
        "format": "CSV",
        "delimiter": ",",
        "textqualifier": "'",
        "containsHeader": "true"
    }
}
```

For instructions on how to create a manifest file, see [Supported formats for Amazon S3 manifest files](supported-manifest-file-format.md).
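Before you upload a manifest, a quick local check can catch structural mistakes. The following Python sketch is not part of Quick Sight; it validates only the basic shape of the keys used in the preceding example:

```python
import json

# Minimal structural check for an Amazon S3 manifest file before upload.
manifest_text = """
{
    "fileLocations": [
        {"URIPrefixes": ["s3://quicksightUser-run-as-role/data/"]}
    ],
    "globalUploadSettings": {
        "format": "CSV",
        "delimiter": ",",
        "textqualifier": "'",
        "containsHeader": "true"
    }
}
"""

manifest = json.loads(manifest_text)

def validate_manifest(m):
    """Return a list of problems found; an empty list means the basic shape is OK."""
    problems = []
    locations = m.get("fileLocations", [])
    if not locations:
        problems.append("fileLocations is missing or empty")
    for loc in locations:
        # A location may use URIPrefixes or URIs; all entries must be S3 paths.
        entries = loc.get("URIPrefixes", []) + loc.get("URIs", [])
        if not all(entry.startswith("s3://") for entry in entries):
            problems.append("every URI or prefix must start with s3://")
    return problems

issues = validate_manifest(manifest)
```

An empty `issues` list confirms the manifest parses as JSON and points at S3 locations; it does not check that the bucket or the IAM role actually exists.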

After you have created a manifest file and added it to your Amazon S3 bucket or uploaded it to Quick, create or update an existing role in IAM that grants `s3:GetObject` access. The following example illustrates how to update an existing IAM role with the AWS API:

```
aws iam put-role-policy \
    --role-name QuickSightAccessToS3RunAsRoleBucket \
    --policy-name GrantS3RunAsRoleAccess \
    --policy-document '{
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": "arn:aws:s3:::s3-bucket-name"
            },
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::s3-bucket-name/manifest.json"
            },
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::s3-bucket-name/*"
            }
        ]
    }'
```

After your policy grants `s3:GetObject` access, you can begin creating data sources that use the updated role with the Amazon S3 data source's manifest file.

```
aws quicksight create-data-source --aws-account-id 111222333444 --region us-west-2 --endpoint https://quicksight.us-west-2.quicksight.aws.com/ \
    --data-source-id "s3-run-as-role-demo-source" \
    --cli-input-json '{
        "Name": "S3 with a custom Role",
        "Type": "S3",
        "DataSourceParameters": {
            "S3Parameters": {
                "RoleArn": "arn:aws:iam::111222333444:role/QuickSightAccessRunAsRoleBucket",
                "ManifestFileLocation": {
                    "Bucket": "s3-bucket-name", 
                    "Key": "manifest.json"
                }
            }
        }
    }'
```


# Deleting datasets


**Important**  
Currently, deleting a dataset is irreversible and can cause irreversible loss of work. Deletes don't cascade to delete dependent objects. Instead, dependent objects stop working, even if you replace the deleted dataset with an identical dataset. 

Before you delete a dataset, we strongly recommend that you first point each dependent analysis or dashboard to a new dataset. 

Currently, when you delete a dataset while dependent visuals still exist, the analyses and dashboards that contain those visuals have no way to assimilate new metadata. They remain visible, but they can't function. They can't be repaired by adding an identical dataset. 

This is because datasets include metadata that is integral to the analyses and dashboards that depend on that dataset. This metadata is uniquely generated for each dataset. Although the Quick Sight engine can read the metadata, it isn't readable by humans (for example, it doesn't contain field names). So, an exact replica of the dataset has different metadata. Each dataset's metadata is unique, even for multiple datasets that share the same name and the same fields.
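If you manage Quick resources programmatically, the equivalent deletion is available through the API. The following is a sketch only; the account ID and dataset ID are placeholders, and the operation is just as irreversible as the console action:

```
aws quicksight delete-data-set \
    --aws-account-id 111122223333 \
    --data-set-id my-deprecated-dataset
```

Because deletes don't cascade, run this only after you have repointed or retired every dependent analysis and dashboard, exactly as described in the procedure that follows.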

**To delete a dataset**

1. Make sure that the dataset isn't being used by any analysis or dashboard that someone wants to keep using.

   On the **Data** page, choose the dataset that you no longer need. Then choose **Delete Dataset** at upper-right. 

1. If you receive a warning that this dataset is in use, track down all dependent analyses and dashboards and point them to a different dataset. If this isn't feasible, try one or more of these best practices instead of deleting the dataset:
   + Rename the dataset, so that the dataset is clearly deprecated.
   + Filter the data, so that the dataset has no rows.
   + Remove everyone else's access to the dataset.

   We recommend that you use whatever means you can to inform owners of dependent objects that this dataset is being deprecated. Also, make sure that you provide sufficient time for them to take action.

1. After you make sure that there are no dependent objects that will stop functioning after the dataset is deleted, choose the dataset and choose **Delete dataset**. Confirm your choice, or choose **Cancel**.


# Adding a dataset to an analysis


After you have created an analysis, you can add more datasets to the analysis. Then, you can use them to create more visuals. 

From within the analysis, you can open any dataset for editing, for example to add or remove fields or to perform other data preparation. You can also remove or replace datasets.

The currently selected dataset displays at the top of the **Data** pane. This is the dataset that is used by the currently selected visual. Each visual can use only one dataset. Choosing a different visual changes the selected dataset to the one used by that visual.

To change the selected dataset manually, choose the dataset list at the top of the **Data** pane and then choose a different dataset. This deselects the currently selected visual if it doesn't use this dataset. Then, choose a visual that uses the selected dataset. Or choose **Add** in the **Visuals** pane to create a new visual using the selected dataset.

If you choose **Suggested** on the toolbar to see suggested visuals, the suggestions are based on the currently selected dataset.

Only filters for the currently selected dataset are shown in the **Filter** pane, and you can only create filters on the currently selected dataset. 

**Topics**
+ [

# Replacing datasets
](replacing-data-sets.md)
+ [

# Remove a dataset from an analysis
](delete-a-data-set-from-an-analysis.md)

Use the following procedure to add a dataset to an analysis or edit a dataset used by an analysis.

**To add a dataset to an analysis**

1. On the analysis page, navigate to the **Data** pane and expand the **Dataset** dropdown.

1. Choose **Add a new dataset** to add a dataset. Or, choose **Manage datasets** to edit a dataset. For more information about editing a dataset, see [Editing datasets](edit-a-data-set.md). 

1. A list of your datasets appears. Choose a dataset and then choose **Select**. To cancel, choose **Cancel**.

# Replacing datasets


In an analysis, you can add, edit, replace, or remove datasets. Use this section to learn how to replace your dataset. 

When you replace a dataset, the new dataset should have similar columns if you expect the visuals to work the way you designed them. Replacing the dataset also clears the undo and redo history for the analysis, which means you can't use the undo and redo buttons on the application bar to navigate your changes. So, when you decide to change the dataset, your analysis design should be reasonably stable, not in the middle of an editing phase.

**To replace a dataset**

1. On the analysis page, navigate to the **Data** pane and expand the **Dataset** dropdown.

1. Choose **Manage datasets**.

1. Choose the ellipsis (three dots) next to the dataset that you want to replace, and then choose **Replace**.

1. In the **Select replacement dataset** page, choose a dataset from the list, and then choose **Select**.
**Note**  
Replacing a dataset clears the undo and redo history for this analysis. 

The dataset is replaced with the new one. The field list and visuals are updated with the new dataset. 

At this point, you can choose to add a new dataset, edit the new dataset, or replace it with a different one. Choose **Close** to exit. 

## If your new dataset doesn't match


In some cases, the selected replacement dataset doesn't contain all of the fields and hierarchies used by the visuals, filters, parameters, and calculated fields in your analysis. If so, you receive a warning from Quick Sight that shows a list of mismatched or missing columns. 

If this happens, you can update the field mapping between the two datasets. 

**To update the field mapping**

1. In the **Mismatch in replacement dataset** page, choose **Update field mapping**.

1. In the **Update field mapping** page, choose the drop-down menu for each field that you want to map, and then choose a field from the list to map it to.

   If the field is missing from the new dataset, choose **Ignore this field**.

1. Choose **Confirm** to confirm your updates.

1. Choose **Close** to close the page and return to your analysis.

The dataset is replaced with the new one. The fields list and visuals are updated with the new dataset.

Any visuals that were using a field that's now missing from the new dataset update to blank. You can re-add fields to the visual or remove the visual from your analysis.

If you change your mind after replacing the dataset, you can still recover. Let's say you replace the dataset and then find that it's too difficult to change your analysis to match the new dataset. You can undo any changes you made to your analysis. You can then replace the new dataset with the original one, or with a dataset that more closely matches the requirements of the analysis. 

# Remove a dataset from an analysis


Use the following procedure to delete a dataset from an analysis.

**To delete a dataset from an analysis**

1. On the analysis page, navigate to the **Data** pane and expand the **Dataset** dropdown.

1. Choose **Manage datasets**.

1. Choose the ellipsis (three dots) next to the dataset that you want to remove, and then choose **Remove**. You can't remove a dataset if it's the only one in the analysis.

1. Choose **Close** to close the dialog box.