

# Connecting to data in Amazon Quick Sight

People in many different roles use Amazon Quick Sight to help them do analysis and advanced calculations, design data dashboards, embed analytics, and make better-informed decisions. Before any of that can happen, someone who understands your data needs to add it to a [Quick Sight dataset](https://docs.aws.amazon.com/quicksuite/latest/userguide/creating-data-sets). Quick Sight supports direct connections and uploads from a variety of [data sources](https://docs.aws.amazon.com/quicksuite/latest/userguide/working-with-data-sources).

**Capabilities and use cases**

**Amazon Quick Standard edition capabilities**  
After your data is available in Quick Standard edition, you can do the following:  
+ Transform the dataset with field formatting, hierarchies, data type conversions, and calculations.
+ Create one or more data analyses based on your newly created dataset.
+ Share your analysis with other people so they can help design it.
+ Add charts, graphs, more datasets, and multiple pages (called sheets) to your data analysis.
+ Create visual appeal with customized formatting and themes.
+ Make your analyses interactive by using parameters, controls, filters, and custom actions.
+ Combine data from multiple data sources, and then build new hierarchies for drilling down, along with calculations that are available only during analysis, such as aggregations and window functions.
+ Publish your analysis as an interactive data dashboard.
+ Share the dashboard so other people can use the dashboard, even if they don't use the analysis that it's based on.
+ Add more data to create more analyses and dashboards.

**Amazon Quick Enterprise edition capabilities**  
After your data is available in Quick Enterprise edition, you can do different things depending on your role. If you can build datasets, design analyses, and publish dashboards, you can do all of the things people using Standard edition can do.   
In addition, these are some examples of additional tasks that you can do:  
+ Create analyses that use Quick Sight insights, including machine learning (ML) powered insights for forecasting, anomaly and outlier detection, and key driver identification.
+ Design narrative insights with text, colors, images, and calculations.
+ Add data from virtual private clouds (VPCs) and on-premises data sources, with data encryption at rest.
+ Control access in datasets by adding row and column level security.
+ Refresh imported datasets every hour.
+ Share emailed reports.

**Application development**  
If you develop applications or use the AWS SDKs and AWS Command Line Interface (AWS CLI), you can do the following and more:  
+ Add embedded analytics and embedded interactive dashboards to websites and applications.
+ Use API operations to manage data sources and datasets.
+ Refresh imported data more frequently by using the data ingestion API operations (see the sketch after this list).
+ Script, transfer, and make templates from analyses and dashboards by using API operations.
+ Programmatically assign people to security roles based on settings managed by system administrators.
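
For example, the data ingestion API can start a SPICE refresh on demand. The following is a minimal sketch that uses the AWS SDK for Python (Boto3); the account ID, dataset ID, and ingestion ID are placeholder values that you replace with your own.

```
import time

import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

account_id = "111122223333"    # placeholder AWS account ID
dataset_id = "my-dataset-id"   # placeholder dataset ID
ingestion_id = "refresh-001"   # any unique ID for this refresh run

# Start a SPICE refresh (ingestion) for an imported dataset.
quicksight.create_ingestion(
    AwsAccountId=account_id,
    DataSetId=dataset_id,
    IngestionId=ingestion_id,
)

# Poll until the ingestion finishes.
while True:
    status = quicksight.describe_ingestion(
        AwsAccountId=account_id,
        DataSetId=dataset_id,
        IngestionId=ingestion_id,
    )["Ingestion"]["IngestionStatus"]
    if status in ("COMPLETED", "FAILED", "CANCELLED"):
        break
    time.sleep(10)

print("Ingestion status:", status)
```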

**Administrative functions in Quick**  
If you perform administrative functions in Quick, you can do the following and more:  
+ Manage security with shared folders to organize your teams' work and help them collaborate on dashboards, analyses, and datasets.
+ Add Quick Sight to your VPC to enable access to data in VPC-based and on-premises data sources.
+ Protect sensitive data with fine-grained access control to AWS data sources.
+ Manually assign people to the Quick author security role so they can prepare datasets, design analyses, and publish data dashboards at a fixed cost per month.
+ Manually assign people to the Quick reader security role so they can securely interact with published data dashboards on a pay-per-session basis.

**Dashboard subscription**  
If you subscribe to dashboards, you can do the following:  
+ Use and subscribe to interactive dashboards designed by your team of experts.
+ Enjoy a simplified, uncluttered interface.
+ View dashboard snapshots in email.
+ Focus on making decisions with the data at your fingertips.

After you connect to or import data, you create a dataset to shape and prepare the data for sharing and reuse. You can view your available datasets on the **Data** page, which you reach by choosing **Data** on the Amazon Quick Sight start page. You can view available data sources and create a new dataset on the **Create a Data Set** page, which you reach by choosing **Create**, then **New dataset**, on the **Data** page.

**Topics**
+ [Supported data sources](supported-data-sources.md)
+ [Connect to your data with integrations and datasets](connecting-to-data-examples.md)
+ [Data source quotas](data-source-limits.md)
+ [Supported data types and values](supported-data-types-and-values.md)
+ [Working with datasets](working-with-datasets.md)
+ [Working with data sources in Amazon Quick Sight](working-with-data-sources.md)

# Supported data sources


Amazon Quick Sight supports a variety of data sources that you can use to provide data for analyses. The following data sources are supported.

## Connecting to relational data


You can use any of the following relational data stores as data sources for Amazon Quick Sight:
+ Amazon Athena
+ Amazon Aurora
+ AWS Glue Data Catalog (accessed through compatible services, such as Athena or Amazon Redshift Spectrum)
+ Amazon OpenSearch Service
+ Amazon Redshift
+ Amazon Redshift Spectrum
+ Amazon S3
+ Amazon S3 Analytics
+ Apache Impala
+ Apache Spark 2.0 or later
+ AWS IoT Analytics
+ Databricks (E2 Platform only) on Spark 1.6 or later, up to version 3.0 
+ Exasol 7.1.2 or later
+ Google BigQuery
+ MariaDB 10.0 or later
+ Microsoft SQL Server 2012 or later
+ MySQL 5.7 or later
**Note**  
Effective October 2023, the MySQL community deprecated support for MySQL version 5.7. This means that Amazon Quick Sight no longer supports new features, enhancements, bug fixes, or security patches for MySQL 5.7. Support for existing query workloads is provided on a best-effort basis. Quick Sight customers can still use MySQL 5.7 datasets with Quick Sight, but we encourage customers to upgrade their MySQL databases to major version 8.0 or higher. To see the statement provided by Amazon RDS, see [Amazon RDS Extended Support opt-in behavior is changing. Upgrade your Amazon RDS for MySQL 5.7 database instances before February 29, 2024 to avoid potential increase in charges](https://repost.aws/articles/ARHdQg4IelQS2uyXkNrINw-A/announcement-amazon-rds-extended-support-opt-in-behavior-is-changing-upgrade-your-amazon-rds-for-mysql-5-7-database-instances-before-february-29-2024-to-avoid-potential-increase-in-charges).  
Amazon RDS has updated its security settings for Amazon RDS MySQL 8.3. Any connections from Quick Sight to Amazon RDS MySQL 8.3 are SSL-enabled by default. This is the only option available for MySQL 8.3 connections.  
TLS 1.2 for MySQL connections requires MySQL version 5.7.28 or higher. For MySQL versions below 5.7.28, Quick Sight falls back to TLS 1.1. If your security requirements mandate TLS 1.2, ensure that your MySQL or Aurora MySQL database is running version 5.7.28 or higher.
+ Oracle 12c or later
+ PostgreSQL 9.3.1 or later
**Note**  
SCRAM-based authentication to PostgreSQL from Amazon Quick Sight is supported for the following connectors: RDS-hosted PostgreSQL, Aurora PostgreSQL, and vanilla PostgreSQL. If the appropriate PostgreSQL engine version is used, and the correct SCRAM configurations are set up in PostgreSQL, no additional configuration is needed in Quick Sight. If you still experience issues establishing a SCRAM-authenticated connection to PostgreSQL from Quick Sight, create a support ticket.
+ Presto 0.167 or later
+ Snowflake
+ Starburst
+ Trino
+ Teradata 14.0 or later
+ Timestream

**Note**  
You can access additional data sources not listed here by linking or importing them through supported data sources.

Amazon Redshift clusters, Amazon Athena databases, and Amazon RDS instances must be in AWS. Other database instances must be in one of the following environments to be accessible from Amazon Quick Sight:
+ Amazon EC2
+ Local (on-premises) databases
+ Data in a data center or some other internet-accessible environment

For more information, see [Infrastructure security in Amazon Quick](infrastructure-and-network-access.md).

## Importing file data


You can use files in Amazon S3 or on your local (on-premises) network as data sources. Quick Sight supports files in the following formats:
+ CSV and TSV – Comma-delimited and tab-delimited text files
+ ELF and CLF – Extended and common log format files
+ JSON – Flat or semistructured data files
+ XLSX – Microsoft Excel files

Quick Sight supports UTF-8 file encoding, but not UTF-8 (with BOM).

Files in Amazon S3 that have been compressed with zip, or gzip ([www.gzip.org](http://www.gzip.org)), can be imported as-is. If you used another compression program for files in Amazon S3, or if the files are on your local network, remove compression before importing them.

### JSON data


Amazon Quick Sight natively supports JSON flat files and JSON semistructured data files.

You can either upload a JSON file or connect to your Amazon S3 bucket that contains JSON data. Amazon Quick Sight automatically performs schema and type inference on JSON files and embedded JSON objects. Then it flattens the JSON, so you can analyze and visualize application-generated data. 

Basic support for JSON flat-file data includes the following:
+ Inferring the schema
+ Determining data types
+ Flattening the data
+ Parsing embedded JSON objects from flat files

Support for JSON file structures (.json) includes the following:
+ JSON records with structures
+ JSON records with root elements as arrays

You can also use the `parseJson` function to extract values from JSON objects in a text file. For example, if your CSV file has a JSON object embedded in one of the fields, you can extract a value from a specified key-value pair (KVP). As a hypothetical illustration, an expression such as `parseJson(myField, "$.state")` returns the value stored under the `state` key. For more information on how to do this, see [parseJson](parseJson-function.md).

The following JSON features aren't supported:
+ Reading JSON with a structure containing a list of records
+ List attributes and list objects within a JSON record; these are skipped during import
+ Customizing upload or configuration settings
+ parseJSON functions for SQL and analyses
+ Error messaging for invalid JSON
+ Extracting a JSON object from a JSON structure
+ Reading delimited JSON records

You can use the `parseJson` function to parse flat files during data preparation. This function extracts elements from valid JSON structures and lists.

The following JSON values are supported:
+ JSON object
+ String (double quoted)
+ Number (integer and float)
+ Boolean
+ NULL
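
To picture what flattening produces, the following sketch (not Quick Sight code) uses the pandas library on a hypothetical record. Nested object keys become dotted column names and types are inferred from the values, similar in spirit to what Quick Sight does on import.

```
import json

import pandas as pd

# A hypothetical JSON record with an embedded object.
record = json.loads(
    '{"order_id": 1001, "total": 29.99, '
    '"customer": {"name": "Ana", "city": "Seattle"}}'
)

# Nested keys become dotted column names, conceptually similar
# to the flattening that Quick Sight performs on import.
df = pd.json_normalize(record)

print(list(df.columns))
# ['order_id', 'total', 'customer.name', 'customer.city']
```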

## Software as a service (SaaS) data


Quick Sight can connect to a variety of Software as a Service (SaaS) data sources either by connecting directly or by using Open Authorization (OAuth).

SaaS sources that support direct connection include the following:
+ Jira
+ ServiceNow

SaaS sources that use OAuth require that you authorize the connection on the SaaS website. For this to work, Quick Sight must be able to access the SaaS data source over the network. These sources include the following:
+ Adobe Analytics
+ GitHub
+ Salesforce

  You can use reports or objects in the following editions of Salesforce as data sources for Amazon Quick Sight:
  + Enterprise Edition
  + Unlimited Edition
  + Developer Edition

## Local data sources


To connect to on-premises data sources, you need to add your data sources and a Quick-specific network interface to Amazon Virtual Private Cloud (Amazon VPC). When configured properly, a VPC based on Amazon VPC resembles a traditional network that you operate in your own data center. It enables you to secure and isolate traffic between resources. You define and control the network elements to suit your requirements, while still getting the benefits of cloud networking and the scalable infrastructure of AWS.

For detailed information, see [Infrastructure security in Amazon Quick](infrastructure-and-network-access.md).

# Connect to your data with integrations and datasets

You can connect Amazon Quick Sight to different types of data sources. This includes data residing in Software-as-a-Service (SaaS) applications, flat files stored in Amazon S3 buckets, data from third-party services like Salesforce, and query results from Athena. Use the following examples to learn more about the requirements for connecting to specific data sources. 

**Topics**
+ [Creating a dataset using Amazon Athena data](create-a-data-set-athena.md)
+ [Using Amazon OpenSearch Service with Amazon Quick Sight](connecting-to-os.md)
+ [Creating a dataset using Amazon S3 files](create-a-data-set-s3.md)
+ [Creating a data source using Apache Spark](create-a-data-source-spark.md)
+ [Using Databricks in Quick Sight](quicksight-databricks.md)
+ [Creating a dataset using Google BigQuery](quicksight-google-big-query.md)
+ [Creating a dataset using a Google Sheets data source](create-a-dataset-google-sheets.md)
+ [Creating a dataset using an Apache Impala data source](create-a-dataset-impala.md)
+ [Creating a dataset using a Microsoft Excel file](create-a-data-set-excel.md)
+ [Creating a data source using Presto](create-a-data-source-presto.md)
+ [Using Snowflake with Amazon Quick Sight](connecting-to-snowflake.md)
+ [Using Starburst with Amazon Quick Sight](connecting-to-starburst.md)
+ [Creating a data source and data set from SaaS sources](connecting-to-saas-data-sources.md)
+ [Creating a dataset from Salesforce](create-a-data-set-salesforce.md)
+ [Using Trino with Amazon Quick Sight](connecting-to-trino.md)
+ [Creating a dataset using a local text file](create-a-data-set-file.md)
+ [Using Amazon Timestream data with Amazon Quick Sight](using-data-from-timestream.md)

# Creating a dataset using Amazon Athena data

Use the following procedure to create a new dataset that connects to Amazon Athena data or to Athena Federated Query data.

**To connect to Amazon Athena**

1. Begin by creating a new dataset. Choose **Data** from the navigation pane at left.

1. Choose **Create**, then choose **New dataset**.

1. Do one of the following:

   1. To use an existing Athena connection profile (common), choose the card for the existing data source that you want to use. Choose **Select**.

      Cards are labeled with the Athena data source icon and the name provided by the person who created the connection.

   1. To create a new Athena connection profile (less common), use the following steps:

      1. Choose **New data source**, then choose the **Athena** data source card.

      1. Choose **Next**.

      1. For **Data source name**, enter a descriptive name.

      1. For **Athena workgroup**, choose your workgroup.

      1. Choose **Validate connection** to test the connection.

      1. Choose **Create data source**.

      1. (Optional) Select an IAM role ARN for queries to run as. 

1. On the **Choose your table** screen, do the following:

   1. For **Catalog**, choose one of the following:
      + If you are using Athena Federated Query, choose the catalog you want to use.
      + Otherwise, choose **AwsDataCatalog**.

   1. Choose one of the following:
      + To write a SQL query, choose **Use custom SQL**. 
      + To choose a database and table, select the catalog that contains your databases from the dropdown under **Catalog**. Then, choose a database from the dropdown under **Database** and choose a table from the **Tables** list that appears for your database.

   If you don't have the right permissions, you receive the following error message: "You don't have sufficient permissions to connect to this dataset or run this query." Contact your Quick administrator for assistance. For more information, see [Authorizing connections to Amazon Athena](athena.md). 

1. Choose **Edit/preview data**. 

1. Create a dataset and analyze the data using the table by choosing **Visualize**. For more information, see [Analyses and reports: Visualizing data in Amazon Quick Sight](working-with-visuals.md). 
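
If you prefer to script this step, the following is a minimal sketch that creates an Athena connection profile with the AWS SDK for Python (Boto3). The account ID, data source ID, name, and workgroup are placeholder values.

```
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

# Create an Athena data source (connection profile) programmatically.
quicksight.create_data_source(
    AwsAccountId="111122223333",           # placeholder account ID
    DataSourceId="my-athena-data-source",  # placeholder ID
    Name="Athena analytics",               # name shown on the data source card
    Type="ATHENA",
    DataSourceParameters={
        "AthenaParameters": {"WorkGroup": "primary"}
    },
)
```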

# Using Amazon OpenSearch Service with Amazon Quick Sight

Following, you can find how to connect to your Amazon OpenSearch Service data using Amazon Quick Sight.

## Creating a new Quick Sight data source connection for OpenSearch Service

Following, you can find how to connect to OpenSearch Service.

Before you can proceed, Amazon Quick Sight needs to be authorized to connect to Amazon OpenSearch Service. If connections aren't enabled, you get an error when you try to connect. A Quick Sight administrator can authorize connections to AWS resources. 

**To authorize Quick Sight to initiate a connection to OpenSearch Service**

1. Open the menu by choosing your profile icon at top right, then choose **Manage Quick**. If you don't see the **Manage Quick** option on your profile menu, ask your Amazon Quick administrator for assistance.

1. Choose **Security & permissions**, **Add or remove**.

1. Enable the option for **OpenSearch**.

1. Choose **Update**.

After OpenSearch Service is accessible, you create a data source so people can use the specified domains.

**To connect to OpenSearch Service**

1. Begin by creating a new dataset. Choose **Data** from the navigation pane at left, then choose **Create** and **New Dataset**.

1. Choose the **Amazon OpenSearch** data source card.

1. For **Data source name**, enter a descriptive name for your OpenSearch Service data source connection, for example `OpenSearch Service ML Data`. Because you can create many datasets from a connection to OpenSearch Service, it's best to keep the name simple.

1. For **Connection type**, choose the network you want to use. This can be a virtual private cloud (VPC) based on Amazon VPC or a public network. The list of VPCs contains the names of VPC connections, rather than VPC IDs. These names are defined by the Quick administrator. 

1. For **Domain**, choose the OpenSearch Service domain that you want to connect to. 

1. Choose **Validate connection** to check that you can successfully connect to OpenSearch Service.

1. Choose **Create data source** to proceed.

1. For **Tables**, choose the one you want to use, then choose **Select** to continue. 

1. Do one of the following:
   + To import your data into the Quick Sight in-memory engine (called SPICE), choose **Import to SPICE for quicker analytics**. For information about how to enable importing OpenSearch data, see [Authorizing connections to Amazon OpenSearch Service](opensearch.md).
   + To allow Quick Sight to run a query against your data each time you refresh the dataset or use the analysis or dashboard, choose **Directly query your data**. 

     To enable autorefresh on a published dashboard that uses OpenSearch Service data, the OpenSearch Service dataset needs to use a direct query.

1. Choose **Edit/Preview** and then **Save** to save your dataset and close it.

## Managing permissions for OpenSearch Service data

The following procedure describes how to view, add, and revoke permissions to allow access to the same OpenSearch Service data source. The people that you add need to be active users in Quick Sight before you can add them. 

**To edit permissions on a data source**

1. Choose **Data** at left, then scroll down to find the data source card for your Amazon OpenSearch Service connection. An example might be `US Amazon OpenSearch Service Data`.

1. Choose the **Amazon OpenSearch** dataset.

1. On the dataset details page that opens, choose the **Permissions** tab.

   A list of current permissions appears.

1. To add permissions, choose **Add users & groups**, then follow these steps:

   1. Add users or groups to allow them to use the same dataset.

   1. When you're finished adding everyone that you want to add, choose the **Permissions** that you want to apply to them.

1. (Optional) To edit permissions, you can choose **Viewer** or **Owner**. 
   + Choose **Viewer** to allow read access.
   + Choose **Owner** to allow that user to edit, share, or delete this Quick Sight dataset. 

1. (Optional) To revoke permissions, choose **Revoke access**. After you revoke someone's access, they can't create new datasets from this data source. However, their existing datasets still have access to this data source.

1. When you are finished, choose **Close**.
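
You can also manage the same kind of permissions programmatically. The following is a minimal sketch using the AWS SDK for Python (Boto3); the account ID, data source ID, and user ARN are placeholders, and the actions shown are an assumption of a typical viewer-style grant on a data source.

```
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

# Grant a user read-only (viewer) access to an existing data source.
quicksight.update_data_source_permissions(
    AwsAccountId="111122223333",               # placeholder account ID
    DataSourceId="my-opensearch-data-source",  # placeholder ID
    GrantPermissions=[
        {
            "Principal": "arn:aws:quicksight:us-east-1:111122223333:user/default/alice",
            "Actions": [
                "quicksight:DescribeDataSource",
                "quicksight:DescribeDataSourcePermissions",
                "quicksight:PassDataSource",
            ],
        }
    ],
)
```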

## Adding a new Quick Sight dataset for OpenSearch Service

After you have an existing data source connection for OpenSearch Service, you can create OpenSearch Service datasets to use for analysis. 

**To create a dataset using OpenSearch Service**

1. From the start page, choose **Data**, **Create**, **New dataset**.

1. Scroll down to the data source card for your OpenSearch Service connection. If you have many data sources, you can use the search bar at the top of the page to find your data source with a partial match on the name.

1. Choose the **Amazon OpenSearch** data source card, and then choose **Create data set**.

1. For **Tables**, choose the OpenSearch Service index that you want to use.

1. Choose **Edit/Preview**.

1. Choose **Save** to save and close the dataset. 

## Adding OpenSearch Service data to an analysis


After you have an OpenSearch Service dataset available, you can add it to a Quick Sight analysis. Before you begin, make sure that you have an existing dataset that contains the OpenSearch Service data that you want to use.

**To add OpenSearch Service data to an analysis**

1. Choose **Analyses** at left.

1. Do one of the following:
   + To create a new analysis, choose **New analysis** at right. 
   + To add to an existing analysis, open the analysis that you want to edit. 
     + Choose the pencil icon at top left.
     + Choose **Add data set**.

1. Choose the OpenSearch Service dataset that you want to add. 

   For information on using OpenSearch Service in visualizations, see [Limitations for using OpenSearch Service](#limitations-for-es). 

1. For more information, see [Working with analyses](https://docs.aws.amazon.com/quicksight/latest/user/working-with-analyses.html).

## Limitations for using OpenSearch Service


The following limitations apply to using OpenSearch Service datasets:
+ OpenSearch Service datasets support a subset of the visual types, sort options, and filter options.
+ To enable autorefresh on a published dashboard that uses OpenSearch Service data, the OpenSearch Service dataset needs to use a direct query.
+ Multiple subquery operations aren't supported. To avoid errors during visualization, don't add multiple fields to a field well. Use one or two fields per visual, and avoid using the **Color** field well.
+ Custom SQL isn't supported.
+ Crossdataset joins and self joins aren't supported.
+ Calculated fields aren't supported. 
+ Text fields aren't supported. 
+ The "other" category isn't supported. If you use an OpenSearch Service dataset with a visualization that supports the "other" category, disable the "other" category by using the menu on the visual. 

# Creating a dataset using Amazon S3 files

To create a dataset using one or more text files (.csv, .tsv, .clf, or .elf) from Amazon S3, create a manifest for Quick Sight. Quick Sight uses this manifest to identify the files that you want to use and the upload settings needed to import them. When you create a dataset using Amazon S3, the file data is automatically imported into [SPICE](spice.md).

You must grant Quick Sight access to any Amazon S3 buckets that you want to read files from. For information about granting Quick Sight access to AWS resources, see [Configuring Amazon Quick Sight access to AWS data sources](access-to-aws-resources.md).

**Topics**
+ [Supported formats for Amazon S3 manifest files](supported-manifest-file-format.md)
+ [Creating Amazon S3 datasets](create-a-data-set-s3-procedure.md)
+ [Datasets using S3 files in another AWS account](using-s3-files-in-another-aws-account.md)

# Supported formats for Amazon S3 manifest files


You use JSON manifest files to specify files in Amazon S3 to import into Quick Sight. These JSON manifest files can use either the Quick Sight format described following or the Amazon Redshift format described in [Using a manifest to specify data files](https://docs.aws.amazon.com/redshift/latest/dg/loading-data-files-using-manifest.html) in the *Amazon Redshift Database Developer Guide*. You don't have to use Amazon Redshift to use the Amazon Redshift manifest file format. 

If you use a Quick Sight manifest file, it must have a .json extension, for example `my_manifest.json`. If you use an Amazon Redshift manifest file, it can have any extension. 

If you use an Amazon Redshift manifest file, Quick Sight processes the optional `mandatory` option as Amazon Redshift does. If the associated file isn't found, Quick Sight ends the import process and returns an error. 

Files that you select for import must be delimited text (for example, .csv or .tsv), log (.clf), or extended log (.elf) format, or JSON (.json). All files identified in one manifest file must use the same file format. Plus, they must have the same number and type of columns. Quick Sight supports UTF-8 file encoding, but not UTF-8 with byte-order mark (BOM). If you are importing JSON files, then for `globalUploadSettings` specify `format`, but not `delimiter`, `textqualifier`, or `containsHeader`.

Make sure that any files that you specify are in Amazon S3 buckets that you have granted Quick Sight access to. For information about granting Quick Sight access to AWS resources, see [Configuring Amazon Quick Sight access to AWS data sources](access-to-aws-resources.md).

## Manifest file format for Quick Sight


Quick Sight manifest files use the following JSON format.

```
{
    "fileLocations": [
        {
            "URIs": [
                "uri1",
                "uri2",
                "uri3"
            ]
        },
        {
            "URIPrefixes": [
                "prefix1",
                "prefix2",
                "prefix3"
            ]
        }
    ],
    "globalUploadSettings": {
        "format": "JSON",
        "delimiter": ",",
        "textqualifier": "'",
        "containsHeader": "true"
    }
}
```

Use the fields in the `fileLocations` element to specify the files to import, and the fields in the `globalUploadSettings` element to specify import settings for those files, such as field delimiters. 

The manifest file elements are described following:
+ **fileLocations** – Use this element to specify the files to import. You can use either or both of the `URIs` and `URIPrefixes` arrays to do this. You must specify at least one value in one or the other of them.
  + **URIs** – Use this array to list URIs for specific files to import.

    Quick Sight can access Amazon S3 files that are in any AWS Region. However, you must use a URI format that identifies the AWS Region of the Amazon S3 bucket if it's different from that used by your Quick account.

    URIs in the following formats are supported. For details, [see the AWS documentation website](http://docs.aws.amazon.com/quick/latest/userguide/supported-manifest-file-format.html).
  + **URIPrefixes** – Use this array to list URI prefixes for S3 buckets and folders. All files in a specified bucket or folder are imported. Quick Sight recursively retrieves files from child folders.

    Quick Sight can access Amazon S3 buckets or folders that are in any AWS Region. Make sure to use a URI prefix format that identifies the S3 bucket's AWS Region if it's different from that used by your Quick account.

    URI prefixes in the following formats are supported. For details, [see the AWS documentation website](http://docs.aws.amazon.com/quick/latest/userguide/supported-manifest-file-format.html).
+ **globalUploadSettings** – (Optional) Use this element to specify import settings for the Amazon S3 files, such as field delimiters. If this element is not specified, Quick Sight uses the default values for the fields in this section.
**Important**  
For log (.clf) and extended log (.elf) files, only the **format** field in this section is applicable, so you can skip the other fields. If you choose to include them, their values are ignored. 
  + **format** – (Optional) Specify the format of the files to be imported. Valid formats are **CSV**, **TSV**, **CLF**, **ELF**, and **JSON**. The default value is **CSV**.
  + **delimiter** – (Optional) Specify the file field delimiter. This must map to the file type specified in the `format` field. Valid values are a comma (**,**) for .csv files and a tab (**\t**) for .tsv files. The default value is a comma (**,**).
  + **textqualifier** – (Optional) Specify the file text qualifier. Valid values are single quote (**'**) and double quote (**\"**). The leading backslash is a required escape character for a double quote in JSON. The default value is double quote (**\"**). If your text doesn't need a text qualifier, don't include this property.
  + **containsHeader** – (Optional) Specify whether the file has a header row. Valid values are **true** and **false**. The default value is **true**.
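
As a sketch of how these settings fit together, the following Python snippet builds a manifest in the format above and uploads it to Amazon S3 with Boto3. The bucket name, prefix, and key are hypothetical.

```
import json

import boto3

# Manifest that imports every CSV file under a hypothetical prefix.
manifest = {
    "fileLocations": [
        {"URIPrefixes": ["s3://amzn-s3-demo-bucket/sales/"]}
    ],
    "globalUploadSettings": {
        "format": "CSV",
        "delimiter": ",",
        "containsHeader": "true",
    },
}

s3 = boto3.client("s3")
s3.put_object(
    Bucket="amzn-s3-demo-bucket",
    Key="manifests/sales-manifest.json",
    Body=json.dumps(manifest, indent=4).encode("utf-8"),
)
```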

### Manifest file examples for Quick Sight


The following are some examples of completed Quick Sight manifest files.

The following example shows a manifest file that identifies two specific .csv files for import. These files use double quotes for text qualifiers. The `format`, `delimiter`, and `containsHeader` fields are skipped because the default values are acceptable.

```
{
    "fileLocations": [
        {
            "URIs": [
                "https://yourBucket.s3.amazonaws.com/data-file.csv",
                "https://yourBucket.s3.amazonaws.com/data-file-2.csv"
            ]
        }
    ],
    "globalUploadSettings": {
        "textqualifier": "\""
    }
}
```

The following example shows a manifest file that identifies one specific .tsv file for import. This file also includes a bucket in another AWS Region that contains additional .tsv files for import. The `textqualifier` and `containsHeader` fields are skipped because the default values are acceptable.

```
{
    "fileLocations": [
        {
            "URIs": [
                "https://s3.amazonaws.com/amzn-s3-demo-bucket/data.tsv"
            ]
        },
        {
            "URIPrefixes": [
                "https://s3-us-east-1.amazonaws.com/amzn-s3-demo-bucket/"
            ]
        }
    ],
    "globalUploadSettings": {
        "format": "TSV",
        "delimiter": "\t"
    }
}
```

The following example identifies two buckets that contain .clf files for import. One is in the same AWS Region as the Quick account, and one in a different AWS Region. The `delimiter`, `textqualifier`, and `containsHeader` fields are skipped because they are not applicable to log files.

```
{
    "fileLocations": [
        {
            "URIPrefixes": [
                "https://amzn-s3-demo-bucket1.your-s3-url.com",
                "s3://amzn-s3-demo-bucket2/"
            ]
        }
    ],
    "globalUploadSettings": {
        "format": "CLF"
    }
}
```

The following example uses the Amazon Redshift format to identify a .csv file for import.

```
{
    "entries": [
        {
            "url": "https://amzn-s3-demo-bucket.your-s3-url.com/myalias-test/file-to-import.csv",
            "mandatory": true
        }
    ]
}
```

The following example uses the Quick Sight format to identify two JSON files for import.

```
{
    "fileLocations": [
        {
            "URIs": [
                "https://yourBucket.s3.amazonaws.com/data-file.json",
                "https://yourBucket.s3.amazonaws.com/data-file-2.json"
            ]
        }
    ],
    "globalUploadSettings": {
        "format": "JSON"
    }
}
```

# Creating Amazon S3 datasets


**To create an Amazon S3 dataset**

1. Check [Data source quotas](data-source-limits.md) to make sure that your target file set doesn't exceed data source quotas.

1. Create a manifest file to identify the text files that you want to import, using one of the formats specified in [Supported formats for Amazon S3 manifest files](supported-manifest-file-format.md).

1. Save the manifest file to a local directory, or upload it into Amazon S3.

1. On the Quick start page, choose **Data**.

1. On the **Data** page, choose **Create**, then **New dataset**.

1. Choose the Amazon S3 icon and then choose **Next**.

1. For **Data source name**, enter a description of the data source. This name should be something that helps you distinguish this data source from others.

1. For **Upload a manifest file**, do one of the following:
   + To use a local manifest file, choose **Upload**, and then choose **Upload a JSON manifest file**. For **Open**, choose a file, and then choose **Open**.
   + To use a manifest file from Amazon S3, choose **URL**, and enter the URL for the manifest file. To find the URL of a pre-existing manifest file in the Amazon S3 console, navigate to the appropriate file and choose it. A properties panel displays, including the link URL. You can copy the URL and paste it into Quick Sight.

1. Choose **Connect**.

1. To make sure that the connection is complete, choose **Edit/Preview data**. Otherwise, choose **Visualize** to create an analysis using the data as-is. 

   If you choose **Edit/Preview data**, you can specify a dataset name as part of preparing the data. Otherwise, the dataset name matches the name of the manifest file. 

   To learn more about data preparation, see [Preparing data in Amazon Quick Sight](preparing-data.md).
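
The console flow above also has an API counterpart. The following is a minimal sketch that registers an S3 data source from a manifest file using the AWS SDK for Python (Boto3); the account ID, data source ID, bucket, and key are placeholders.

```
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

# Register an S3 data source that points at a manifest file.
quicksight.create_data_source(
    AwsAccountId="111122223333",       # placeholder account ID
    DataSourceId="my-s3-data-source",  # placeholder ID
    Name="Sales files",
    Type="S3",
    DataSourceParameters={
        "S3Parameters": {
            "ManifestFileLocation": {
                "Bucket": "amzn-s3-demo-bucket",
                "Key": "manifests/sales-manifest.json",
            }
        }
    },
)
```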

## Creating datasets based on multiple Amazon S3 files


You can use one of several methods to merge or combine files from Amazon S3 buckets inside Quick Sight:
+ **Combine files by using a manifest** – In this case, the files must have the same number of fields (columns). The data types must match between fields in the same position in the file. For example, the first field must have the same data type in each file. The same goes for the second field, and the third field, and so on. Quick Sight takes field names from the first file.

  The files must be listed explicitly in the manifest. However, they don't have to be inside the same Amazon S3 bucket.

  In addition, the files must follow the rules described in [Supported formats for Amazon S3 manifest files](supported-manifest-file-format.md).

  For more details about combining files using a manifest, see [Creating a dataset using Amazon S3 files](create-a-data-set-s3.md).
+ **Merge files without using a manifest** – To merge multiple files into one without having to list them individually in the manifest, you can use Athena. With this method, you can simply query your text files, like they are in a table in a database. For more information, see the post [Analyzing data in Amazon S3 using Athena](https://aws.amazon.com/blogs/big-data/analyzing-data-in-s3-using-amazon-athena/) in the Big Data blog. 
+ **Use a script to append files before importing** – You can use a script designed to combine your files before uploading. 

# Datasets using S3 files in another AWS account

Use this section to learn how to set up security so you can use Quick Sight to access Amazon S3 files in another AWS account. 

For you to access files in another account, the owner of the other account must first set Amazon S3 to grant you permissions to read the file. Then, in Quick Sight, you must set up access to the buckets that were shared with you. After both of these steps are finished, you can use a manifest to create a dataset.

**Note**  
 To access files that are shared with the public, you don't need to set up any special security. However, you still need a manifest file.

**Topics**
+ [Setting up Amazon S3 to allow access from a different Quick account](#setup-S3-to-allow-access-from-a-different-quicksight-account)
+ [Setting up Quick Sight to access Amazon S3 files in another AWS account](#setup-quicksight-to-access-S3-in-a-different-account)

## Setting up Amazon S3 to allow access from a different Quick account

Use this section to learn how to set permissions in Amazon S3 files so they can be accessed by Quick Sight in another AWS account. 

For information on accessing another account's Amazon S3 files from your Quick Sight account, see [Setting up Quick Sight to access Amazon S3 files in another AWS account](#setup-quicksight-to-access-S3-in-a-different-account). For more information about S3 permissions, see [Managing access permissions to your Amazon S3 resources](https://docs.aws.amazon.com/AmazonS3/latest/dev/s3-access-control.html) and [How do I set permissions on an object?](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/set-object-permissions.html)

You can use the following procedure to set this access from the S3 console. Or you can grant permissions by using the AWS CLI or by writing a script. If you have a lot of files to share, you can instead create an S3 bucket policy that allows the `s3:GetObject` action. To use a bucket policy, add it to the bucket permissions, not to the file permissions. For information on bucket policies, see [Bucket policy examples](https://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies.html) in the *Amazon S3 Developer Guide*.
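
As a sketch, the following Boto3 snippet applies such a bucket policy. The bucket name and the account ID of the Quick Sight account are placeholders.

```
import json

import boto3

bucket = "amzn-s3-demo-bucket"      # bucket being shared
reader_account_id = "444455556666"  # placeholder account that runs Quick Sight

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {   # Let the other account list the bucket.
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{reader_account_id}:root"},
            "Action": "s3:ListBucket",
            "Resource": f"arn:aws:s3:::{bucket}",
        },
        {   # Let the other account read the objects.
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{reader_account_id}:root"},
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
        },
    ],
}

s3 = boto3.client("s3")
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```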

**To set access from a different Quick account from the S3 console**

1. Get the email address of the AWS account that you want to share with. Or you can get and use the canonical user ID. For more information on canonical user IDs, see [AWS account identifiers](https://docs.aws.amazon.com/general/latest/gr/acct-identifiers.html) in the *AWS General Reference*.

1. Sign in to the AWS Management Console and open the Amazon S3 console at [https://console.aws.amazon.com/s3/](https://console.aws.amazon.com/s3/).

1. Find the Amazon S3 bucket that you want to share with Quick Sight. Choose **Permissions**.

1. Choose **Add Account**, and then enter an email address, or paste in a canonical user ID, for the AWS account that you want to share with. This email address should be the primary one for the AWS account. 

1. Choose **Yes** for both **Read bucket permissions** and **List objects**.

   Choose **Save** to confirm.

1. Find the file that you want to share, and open the file's permission settings. 

1. Enter an email address or the canonical user ID for the AWS account that you want to share with. This email address should be the primary one for the AWS account. 

1. Enable **Read object** permissions for each file that Quick Sight needs access to. 

1. Notify the Quick user that the files are now available for use.

## Setting up Quick Sight to access Amazon S3 files in another AWS account

Use this section to learn how to set up Quick Sight so you can access Amazon S3 files in another AWS account. For information on allowing someone else to access your Amazon S3 files from their Quick account, see [Setting up Amazon S3 to allow access from a different Quick account](#setup-S3-to-allow-access-from-a-different-quicksight-account).

Use the following procedure to access another account's Amazon S3 files from Quick Sight. Before you can use this procedure, the users in the other AWS account must share the files in their Amazon S3 bucket with you.

**To access another account's Amazon S3 files from Quick Sight**

1. Verify that the user or users in the other AWS account gave your account read and write permission to the S3 bucket in question. 

1. Choose your profile icon, and then choose **Manage Quick Sight**.

1. Choose **Security & permissions**.

1. Under **Quick Sight access to AWS services**, choose **Manage**.

1. Choose **Select S3 buckets**.

1. On the **Select Amazon S3 buckets** screen, choose the **S3 buckets you can access across AWS** tab.

   The default tab is named **S3 buckets linked to Quick Sight account**. It shows all the buckets your Quick account has access to. 

1. Do one of the following:
   + To add all the buckets that you have permission to use, choose **Choose accessible buckets from other AWS accounts**. 
   + If you have one or more Amazon S3 buckets that you want to add, enter their names. Each must exactly match the unique name of the Amazon S3 bucket.

     If you don't have the appropriate permissions, you see the error message "We can't connect to this S3 bucket. Make sure that any S3 buckets you specify are associated with the AWS account used to create this Quick account." This error message appears if you don't have either account permissions or Quick Sight permissions.
**Note**  
To use Amazon Athena, Quick Sight needs to access the Amazon S3 buckets that Athena uses.   
You can add them here one by one, or use the **Choose accessible buckets from other AWS accounts** option.

1. Choose **Select buckets** to confirm your selection. 

1. Create a new dataset based on Amazon S3, and upload your manifest file. For more information about Amazon S3 datasets, see [Creating a dataset using Amazon S3 files](create-a-data-set-s3.md).

# Creating a data source using Apache Spark

You can connect directly to Apache Spark using Quick Sight, or you can connect to Spark through Spark SQL. Using the results of queries, or direct links to tables or views, you create data sources in Quick Sight. You can either directly query your data through Spark, or you can import the results of your query into [SPICE](spice.md).

Before you use Quick Sight with Spark products, you must configure Spark for Quick Sight. 

Quick Sight requires your Spark server to be secured and authenticated using LDAP, which is available to Spark version 2.0 or later. If Spark is configured to allow unauthenticated access, Quick Sight refuses the connection to the server. To use Quick Sight as a Spark client, you must configure LDAP authentication to work with Spark. 

The Spark documentation contains information on how to set this up. To start, you need to configure it to enable front-end LDAP authentication over HTTPS. For general information on Spark, see [the Apache Spark website](http://spark.apache.org/). For information specifically on Spark and security, see the [Spark security documentation](http://spark.apache.org/docs/latest/security.html).

To make sure that you have configured your server for Quick Sight access, follow the instructions in [Network and database configuration requirements](configure-access.md).

# Using Databricks in Quick Sight

Use this section to learn how to connect from Quick Sight to Databricks. 

**To connect to Databricks**

1. Begin by creating a new dataset. Choose **Data** from the navigation pane at left.

1. Choose **Create** then **New Dataset**.

1. Choose the **Databricks** data source card.

1. For **Data source name**, enter a descriptive name for your Databricks data source connection, for example `Databricks CS`. Because you can create many datasets from a connection to Databricks, it's best to keep the name simple.

1. For **Connection type**, select the type of network you're using. 
   + **Public network** – if your data is shared publicly.
   + **VPC** – if your data is inside a VPC. 
**Note**  
If you're using VPC, and you don't see it listed, check with your administrator. 

1.  For **Database server**, enter the **Hostname of workspace** specified in your Databricks connection details.

1.  For **HTTP Path**, enter the **Partial URL for the spark instance** specified in your Databricks connection details.

1.  For **Port**, enter the **port** specified in your Databricks connection details.

1.  For **Username** and **Password**, enter your connection credentials.

1.  To verify that the connection is working, choose **Validate connection**.

1.  To finish and create the data source, choose **Create data source**.
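
The console steps above can also be scripted. The following is a minimal sketch using the AWS SDK for Python (Boto3), assuming the `DATABRICKS` data source type and its `Host`, `Port`, and `SqlEndpointPath` parameters; every value shown is a placeholder.

```
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

# Create a Databricks data source; all values below are placeholders.
quicksight.create_data_source(
    AwsAccountId="111122223333",
    DataSourceId="my-databricks-data-source",
    Name="Databricks CS",
    Type="DATABRICKS",
    DataSourceParameters={
        "DatabricksParameters": {
            "Host": "dbc-a1b2c3d4-e5f6.cloud.databricks.com",  # workspace hostname
            "Port": 443,
            "SqlEndpointPath": "/sql/1.0/warehouses/abc123",   # HTTP path
        }
    },
    Credentials={
        "CredentialPair": {
            "Username": "databricks-user",
            "Password": "databricks-password",
        }
    },
)
```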

## Adding a new Quick Sight dataset for Databricks

After you have an existing data source connection for Databricks data, you can create Databricks datasets to use for analysis. 

**To create a dataset using Databricks**

1. Choose **Data** at left, then scroll down to find the data source card for your Databricks connection. If you have many data sources, you can use the search bar at the top of the page to find your data source with a partial match on the name.

1. Choose the **Databricks** data source card, and then choose **Create data set**.

1. To specify the table you want to connect to, first select the Catalog and Schema you want to use. Then, for **Tables**, select the table that you want to use. If you prefer to use your own SQL statement, select **Use custom SQL**. 

1. Choose **Edit/Preview**.

1. (Optional) To add more data, use the following steps: 

   1. Choose **Add data** at top right.

   1. To connect to different data, choose **Switch data source**, and choose a different dataset. 

   1. Follow the UI prompts to finish adding data. 

   1. After adding new data to the same dataset, choose **Configure this join** (the two red dots). Set up a join for each additional table.

   1. If you want to add calculated fields, choose **Add calculated field**. 

   1. To add a model from SageMaker AI, choose **Augment with SageMaker**. This option is only available in Quick Enterprise edition.

   1. Clear the check box for any fields that you want to omit.

   1. Update any data types that you want to change.

1. When you are done, choose **Save** to save and close the dataset. 

## Quick Sight Administrator's guide to connecting Databricks

You can use Amazon Quick Sight to connect to Databricks on AWS. You can connect to Databricks on AWS whether you signed up through AWS Marketplace or through the Databricks website.

Before you can connect to Databricks, you create or identify the existing resources that the connection requires. Use this section to help you gather the resources that you need to connect from Quick Sight to Databricks.
+ To learn how to obtain your Databricks connection details, see [Databricks ODBC and JDBC connections](https://docs.databricks.com/integrations/jdbc-odbc-bi.html#get-server-hostname-port-http-path-and-jdbc-url).
+ To learn how to obtain your Databricks credentials—personal access token or user name and password—for authentication, see [Authentication requirements](https://docs.databricks.com/integrations/bi/jdbc-odbc-bi.html#authentication-requirements) in the [Databricks documentation](https://docs.databricks.com/index.html). 

  To connect to a Databricks cluster, you need `Can Attach To` and `Can Restart` permissions. These permissions are managed in Databricks. For more information, see [Permission Requirements](https://docs.databricks.com/integrations/jdbc-odbc-bi.html#permission-requirements) in the [Databricks documentation](https://docs.databricks.com/index.html).
+ If you are setting up a private connection for Databricks, see [Connecting to a VPC with Amazon Quick Sight](https://docs.aws.amazon.com/quicksight/latest/user/working-with-aws-vpc.html) in the Quick Sight documentation to learn how to configure a VPC for use with Quick Sight. If the connection isn't visible, verify with a system administrator that the network has open [inbound endpoints for Amazon Route 53](https://docs.aws.amazon.com/quicksight/latest/user/vpc-route-53.html). If the hostname of a Databricks workspace uses a public IP address, the Route 53 security group needs DNS TCP and DNS UDP inbound and outbound rules to allow traffic on DNS port 53. An administrator needs to create a security group with two inbound rules: one for DNS (TCP) on port 53 to the VPC CIDR, and one for DNS (UDP) on port 53 to the VPC CIDR.

  For Databricks-related details if you are using PrivateLink instead of a public connection, see [Enable AWS PrivateLink](https://docs.databricks.com/administration-guide/cloud-configurations/aws/privatelink.html) in the [Databricks documentation](https://docs.databricks.com/index.html). 

# Creating a dataset using Google BigQuery

**Note**  
When Quick Sight uses and transfers information that is received from Google APIs, it adheres to the [Google API Services User Data Policy](https://developers.google.com/terms/api-services-user-data-policy).

Google BigQuery is a fully managed serverless data warehouse that customers use to manage and analyze their data. Google BigQuery customers use SQL to query their data without any infrastructure management.

## Creating a data source connection with Google BigQuery


**Prerequisites**

Before you start, make sure that you have the following. These are all required to create a data source connection with Google BigQuery:
+ **Project ID** – The project ID that is associated with your Google account. To find this, navigate to the Google Cloud console and choose the name of the project that you want to connect to Quick Sight. Copy the project ID that appears in the new window and record it for later use.
+ **Dataset Region** – The Google region that the Google BigQuery project exists in. To find the dataset region, navigate to the Google BigQuery console and choose **Explorer**. Locate and expand the project that you want to connect to, then choose the dataset that you want to use. The dataset region appears in the pop-up that opens.
+ **Google account login credentials** – The login credentials for your Google account. If you don't have this information, contact your Google account administrator.
+ **Google BigQuery Permissions** – To connect your Google account with Quick Sight, make sure that your Google account has the following permissions:
  + `BigQuery Job User` at the `Project` level.
  + `BigQuery Data Viewer` at the `Dataset` or `Table` level.
  + `BigQuery Metadata Viewer` at the `Project` level.

For information about how to retrieve the previous prerequisite information, see [Unlock the power of unified business intelligence with Google Cloud BigQuery and Quick Sight](https://aws.amazon.com/blogs/business-intelligence/unlock-the-power-of-unified-business-intelligence-with-google-cloud-bigquery-and-amazon-quicksight/).

Use the following procedure to connect your Quick account with your Google BigQuery data source.

**To create a new connection to a Google BigQuery data source from Quick Sight**

1. Open the [Quick console](https://quicksight.aws.amazon.com/).

1. From the left navigation pane, choose **Data**.

1. Choose **Create**, then choose **New Dataset**.

1. Choose the **Google BigQuery** tile.

1. Add the data source details that you recorded in the prerequisites section earlier:
   + **Data source name** – A name for the data source.
   + **Project ID** – A Google Platform project ID. This field is case sensitive.
   + **Dataset Region** – The Google cloud platform dataset region of the project that you want to connect to.

1. Choose **Sign in**.

1. In the new window that opens, enter the login credentials for the Google account that you want to connect to.

1. Choose **Continue** to grant Quick Sight access to Google BigQuery.

1. After you create the new data source connection, continue to [Step 4](#gbq-step-4) in the following procedure.

## Adding a new Quick Sight dataset for Google BigQuery


After you create a data source connection with Google BigQuery, you can create Google BigQuery datasets for analysis. Datasets that use Google BigQuery can only be stored in SPICE.

**To create a dataset using Google BigQuery**

1. Open the [Quick console](https://quicksight.aws.amazon.com/).

1. From the start page, choose **Data**.

1. Choose **Create**, then **New Dataset**.

1. Choose the **Google BigQuery** tile, and then choose **Create dataset**.

1. <a name="gbq-step-4"></a>For **Tables**, do one of the following:
   + Choose the table that you want to use.
   + Choose **Use custom SQL** to use your own SQL statement. For more information about using custom SQL in Quick Sight, see [Using SQL to customize data](adding-a-SQL-query.md).

1. Choose **Edit/Preview**.

1. (Optional) In the **Data prep** page that opens, you can add customizations to your data with calculated fields, filters, and joins.

1. When you are finished making changes, choose **Save** to save and close the dataset.

# Creating a dataset using a Google Sheets data source

Google Sheets is a web-based spreadsheet application that enables users to create, edit, and collaborate on data in real time. With its comprehensive set of functions and formulas, it serves as a powerful data source for business intelligence and analytics. Users can organize, analyze, and share insights efficiently, while its seamless collaboration features make it an ideal platform for teams working on data-driven projects.

## Admin configuration in Amazon Quick


Amazon Quick administrators need to perform a one-time setup to enable Google Sheets as a data source. For detailed instructions and important considerations, see [the blog](https://aws.amazon.com//blogs/business-intelligence/transform-your-google-sheets-data-into-powerful-analytics-with-amazon-quicksight/).

## Creating a dataset using a Google Sheets data source


Use the following procedure to create a dataset using a Google Sheets data source.

**To create a dataset using a Google Sheets data source**

1. From the Quick start page, choose **Datasets**.

1. On the **Datasets** page, choose **New Dataset**.

1. Choose **Google Sheets**.

1. Enter a name for the data source, and then choose **Connect**.

1. When redirected to Google's sign-in page, do the following:

   1. Enter your Google account credentials, and then choose **Next**.

   1. Review the permissions to authorize your AWS account to connect with Google Sheets, and then choose **Continue**.

1. In the **Choose your table** menu, locate your data. The menu displays all folders, subfolders, sheets, and tabs from your Google account. To display the tabs, select a sheet from the displayed list.

1. Select the tab you want to work with.

1. Choose **Edit/Preview data** to navigate to the Data preparation page. Choose **Add data** to include any additional tabs.

1. Configure the join, and then select **Publish & visualize** to analyze your Google Sheets data with Quick Sight.

**Note**  
This connector supports only SPICE functionality.
If your OAuth token expires (visible in the ingestion error report or when creating a new dataset), reauthorize by choosing **Edit** on the data source and updating it.

# Creating a dataset using an Apache Impala data source

Apache Impala is a high-performance massively parallel processing (MPP) SQL query engine designed to run natively on Apache Hadoop. Use the procedure below to establish a secure connection between Quick Sight and Apache Impala.

All traffic between Quick Sight and Apache Impala is encrypted using SSL. Quick Sight supports standard username and password authentication for Impala connections.

To establish a connection, you'll need to configure SSL settings in your Impala instance, prepare your authentication credentials, set up the connection in Quick Sight using your Impala server details, and validate the connection to ensure secure data access.

**To create a dataset using an Apache Impala data source**

1. On the Quick start page, choose **Data**.

1. On the **Data** page, choose **Create**.

1. Choose **Data source**.

1. Choose **Impala**, then choose **Next**.

1. Enter a name for the data source.

1. For public connections:

   1. Enter connection details for **Database server**, **HTTP Path**, **Port**, **Username**, and **Password**.

   1. Once the validation is successful, choose **Create data source**.

1. For private connections:

   1. Coordinate with your administrator to set up a VPC connection before entering connection details.

     You or your administrator can [configure the VPC connection in Quick](vpc-creating-a-connection-in-quicksight.md). SSL is enabled by default to ensure secure data transmission. If you encounter connection validation errors, verify your connection and VPC details.

     If issues persist, consult your administrator to confirm that your Certificate Authority is included in Quick Sight's [approved list of certificates](configure-access.md#ca-certificates).

1. In the **Choose your table** menu, do one of the following:
   + Choose a specific schema or table, and then choose **Select**.
   + Choose **Use custom SQL** to write your own SQL query.

1. After completing your selection, you will be redirected to the data preparation page. Make any adjustments to your data, then choose **Publish & visualize** to analyze your Impala data in Quick Sight.

**Note**  
This connector supports the following:  
+ Username and password authentication
+ Public and private connections
+ Table discovery and custom SQL queries
+ Full data refresh during ingestion
+ SPICE storage only

# Creating a dataset using a Microsoft Excel file
Microsoft Excel files

To create a dataset using a Microsoft Excel file data source, upload an .xlsx file from a local or networked drive. The data is imported into [SPICE](spice.md).

 For more information about creating new Amazon S3 datasets using Amazon S3 data sources, see [Creating a dataset using an existing Amazon S3 data source](create-a-data-set-existing.md#create-a-data-set-existing-s3) or [Creating a dataset using Amazon S3 files](create-a-data-set-s3.md). 

**To create a dataset based on an Excel file**

1. Check [Data source quotas](data-source-limits.md) to make sure that your target file doesn't exceed data source quotas.

1. On the Quick start page, choose **Data**.

1. On the **Data** page, choose **Create**, and then choose **New dataset**.

1. Choose **Upload a file**.

1. In the **Open** dialog box, choose a file, and then choose **Open**.

   A file must be 1 GB or less to be uploaded to Quick Sight.

1. If the Excel file contains multiple sheets, choose the sheet to import. You can change this later by preparing the data. 

1. Choose **Select** to confirm your settings. Or, choose **Edit/Preview data** to prepare the data immediately.
**Note**  
On the following screens, you have multiple chances to prepare the data. Each of them takes you to the **Prepare Data** screen. This is the same screen that you can access after the data import is complete. It enables you to change the upload settings even after the upload is complete.

   A preview of the data appears on the next screen. You can't make changes directly to the data preview. 

1. If the data headings and content don't look correct, choose **Edit settings and prepare data** to correct the file upload settings. 

   Otherwise, choose **Next**.

1. On the **Data Source Details** screen, you can choose **Edit/Preview data**. You can specify a dataset name in the **Prepare Data** screen. 

   If you don't need to prepare the data, you can choose to create an analysis using the data as-is. Choose **Visualize**. Doing this names the dataset the same as the source file and takes you to the **Analysis** screen. To learn more about data preparation and Excel upload settings, see [Preparing data in Amazon Quick Sight](preparing-data.md).

**Note**  
If at any time you want to make changes to the file, such as adding a new field, you must make the change in Microsoft Excel and create a new dataset using the updated version in Quick Sight. For more information about possible implications of changing datasets, see [Things to consider when editing datasets](edit-a-data-set.md#change-a-data-set).

# Creating a data source using Presto
Presto

Presto (or PrestoDB) is an open-source, distributed SQL query engine, designed for fast analytic queries against data of any size. It supports both nonrelational and relational data sources. Supported nonrelational data sources include the Hadoop Distributed File System (HDFS), Amazon S3, Cassandra, MongoDB, and HBase. Supported relational data sources include MySQL, PostgreSQL, Amazon Redshift, Microsoft SQL Server, and Teradata. 

For more information about Presto, see the following:
+ [Introduction to Presto](https://aws.amazon.com/big-data/what-is-presto/), a description of Presto on the AWS website.
+ [Creating a Presto cluster with Amazon Elastic MapReduce (Amazon EMR)](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-presto.html) in the *Amazon EMR Release Guide*.
+ For general information on Presto, see the [Presto documentation](https://trino.io/docs/current/).

The results of the queries that you run through the Presto query engine can be turned into Quick Sight datasets. Presto processes the analytic queries on the backend databases. Then it returns results to the Quick Sight client. You can directly query your data through Presto, or you can import the results of your query into SPICE. 

Before you use Quick Sight as a Presto client to run queries, make sure that you configure data source profiles. You need a data source profile in Quick Sight for each Presto data source that you want to access. Use the following procedure to create a connection to Presto.

**To create a new connection to a Presto data source from Amazon Quick Sight (console)**

1. On the Amazon Quick Sight start page, choose **Data** at left.

1. Choose **Create** then **New dataset**. 

1. Choose the **Presto** tile. 
**Note**  
In most browsers, you can use Ctrl-F or Cmd-F to open a search box and enter **presto** to locate it. 

1. Add the settings for the new data source:
   + **Data source name** – Enter a descriptive name for your data source connection. This name appears in the **Existing data sources** section at the bottom of the **Data sets** screen. 
   + **Connection type** – Choose the connection type that you need to use to connect to Presto. 

     To connect through the public network, choose **Public network**. 

     If you use a public network, your Presto server must be secured and authenticated using Lightweight Directory Access Protocol (LDAP). For information on configuring Presto to use LDAP, see [LDAP authentication](https://trino.io/docs/current/security/ldap.html) in the Presto documentation. 

     To connect through a virtual private connection, choose the appropriate VPC name from the **VPC connections** list. 

     If your Presto server allows unauthenticated access, AWS requires that you connect to it securely by using a private VPC connection. For information on configuring a new VPC, see [Configuring VPC connections in Amazon Quick Sight](working-with-aws-vpc.md).
   + **Database server** – The name of the database server. 
   + **Port** – The port that the server uses to accept incoming connections from Amazon Quick Sight. 
   + **Catalog** – The name of the catalog that you want to use. 
   + **Authentication required** – (Optional) This option only appears if you choose a VPC connection type. If the Presto data source that you're connecting to doesn't require authentication, choose **No**. Otherwise, keep the default setting (**Yes**). 
   + **Username** – Enter a user name to use to connect to Presto. Quick Sight applies the same user name and password to all connections that use this data source profile. If you want to monitor Quick Sight separately from other accounts, create a Presto account for each Quick Sight data source profile. 

     The Presto account that you use needs to be able to access the database and run `SELECT` statements on at least one table. 
   + **Password** – The password to use with the Presto user name. Amazon Quick Sight encrypts all credentials that you use in the data source profile. For more information, see [Data encryption in Amazon Quick](data-encryption.md). 
   + **Enable SSL** – SSL is enabled by default. 

1. Choose **Validate connection** to test your settings.

1. After you validate your settings, choose **Create data source** to complete the connection.

# Using Snowflake with Amazon Quick Sight
Snowflake

Snowflake is an AI data cloud platform that provides data solutions from data warehousing and collaboration to data science and generative AI. Snowflake is an [AWS Partner](https://partners.amazonaws.com/partners/001E000000d8qQcIAI/Snowflake) with multiple AWS accreditations that include AWS ISV Competencies in Generative AI, Machine Learning, Data and Analytics, and Retail.

Amazon Quick Sight offers two ways to connect to Snowflake: with your Snowflake login credentials or with OAuth client credentials. Use the following sections to learn about both methods of connection.

**Topics**
+ [Creating a Quick Sight data source connection to Snowflake with login credentials](#create-connection-to-snowflake)
+ [Creating a Quick Sight data source connection to Snowflake with OAuth client credentials](#create-connection-to-snowflake-oauth-credentials)

## Creating a Quick Sight data source connection to Snowflake with login credentials
Connecting with login credentials

Use this section to learn how to create a connection between Quick Sight and Snowflake with your Snowflake login credentials. All traffic between Quick Sight and Snowflake is encrypted with SSL.

**To create a connection between Quick Sight and Snowflake**

1. Open the [Quick console](https://quicksight.aws.amazon.com/).

1. From the left navigation pane, choose **Data**, then choose **Create**, then choose **New Dataset**.

1. Choose the **Snowflake** data source card.

1. In the pop-up that appears, enter the following information:

   1. For **Data source name**, enter a descriptive name for your Snowflake data source connection. Because you can create many datasets from a connection to Snowflake, it's best to keep the name simple.

   1. For **Connection type**, choose the type of network that you're using. Choose **Public network** if your data is shared publicly. Choose **VPC** if your data is located inside a VPC. To configure a VPC connection in Quick Sight, see [Managing VPC connection in Amazon Quick](vpc-creating-a-connection-in-quicksight.md).

   1. For **Database server**, enter the hostname specified in your Snowflake connection details.

1. For **Database name and Warehouse**, enter the respective Snowflake database and warehouse that you want to connect to.

1. For **Username** and **Password**, enter your Snowflake credentials.

After you have successfully created a data source connection between your Quick Sight and Snowflake accounts, you can begin [Creating datasets](creating-data-sets.md) that contain Snowflake data.

## Creating a Quick Sight data source connection to Snowflake with OAuth client credentials
Connecting with OAuth client credentials

You can use OAuth client credentials to connect your Quick Sight account with Snowflake through the [Quick Sight APIs](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_CreateDataSource.html). *OAuth* is a standard authorization protocol that is often utilized for applications that have advanced security requirements. When you connect to Snowflake with OAuth client credentials, you can create datasets that contain Snowflake data with the Quick Sight APIs and in the Quick Sight UI. For more information about configuring OAuth in Snowflake, see [Snowflake OAuth overview](https://docs.snowflake.com/en/user-guide/oauth-snowflake-overview).

Quick Sight supports the `client credentials` OAuth grant type. The client credentials grant is used to obtain an access token for machine-to-machine communication. This method is suitable for scenarios where a client needs to access resources that are hosted on a server without the involvement of a user.

In the client credentials flow of OAuth 2.0, there are several client authentication mechanisms that can be used to authenticate the client application with the authorization server. Quick Sight supports client credentials based OAuth for Snowflake for the following two mechanisms:
+ **Token (Client secrets-based OAuth)**: The secret-based client authentication mechanism is used with the client credentials grant flow to authenticate with the authorization server. This authentication scheme requires the `client_id` and `client_secret` of the OAuth client app to be stored in Secrets Manager.
+ **X509 (Client private key JWT-based OAuth)**: The X509 certificate key-based solution provides an additional security layer to the OAuth mechanism with client certificates that are used to authenticate instead of client secrets. This method is primarily used by private clients who use this method to authenticate with the authorization server with strong trust between the two services.

Quick Sight has validated OAuth connections with the following identity providers:
+ OKTA
+ PingFederate

### Storing OAuth credentials in Secrets Manager
Storing OAuth credentials

OAuth client credentials are meant for machine-to-machine use cases and are not designed to be interactive. To create a data source connection between Quick Sight and Snowflake, create a new secret in Secrets Manager that contains your credentials for the OAuth client app. The secret ARN that is created with the new secret can be used to create datasets that contain Snowflake data in Quick Sight. For more information about using Secrets Manager keys in Quick Sight, see [Using AWS Secrets Manager secrets instead of database credentials in Quick](secrets-manager-integration.md).

The credentials that you need to store in Secrets Manager are determined by the OAuth mechanism that you use. The following key/value pairs are required for X509-based OAuth secrets:
+ `username`: The Snowflake account username to be used when connecting to Snowflake
+ `client_id`: The OAuth client ID
+ `client_private_key`: The OAuth client private key
+ `client_public_key`: The OAuth client certificate public key and its encryption algorithm (for example, `{"alg": "RS256", "kid": "cert_kid"}`)

The following key/value pairs are required for token-based OAuth secrets:
+ `username`: The Snowflake account username to be used when connecting to Snowflake
+ `client_id`: The OAuth client ID
+ `client_secret`: The OAuth client secret
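
For illustration, the following Python sketch stores a token-based secret with these keys by using the AWS SDK for Python (Boto3). The secret name, Region, and credential values are placeholders for this sketch, not values that Quick Sight requires.

```
import json

import boto3

# Create a Secrets Manager secret that holds token-based OAuth client
# credentials for Snowflake. The secret name, Region, and values below
# are placeholders.
secretsmanager = boto3.client("secretsmanager", region_name="us-east-1")

response = secretsmanager.create_secret(
    Name="snowflake-oauth-client-credentials",
    SecretString=json.dumps(
        {
            "username": "SNOWFLAKE_USERNAME",
            "client_id": "OAUTH_CLIENT_ID",
            "client_secret": "OAUTH_CLIENT_SECRET",
        }
    ),
)

# Pass this ARN as Credentials.SecretArn when you create the data source.
print(response["ARN"])
```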

### Creating a Snowflake OAuth connection with the Quick Sight APIs
Example

After you create a secret in Secrets Manager that contains your Snowflake OAuth credentials and have connected your Quick account to Secrets Manager, you can establish a data source connection between Quick Sight and Snowflake with the Quick Sight APIs and SDK. The following example creates a Snowflake data source connection using token OAuth client credentials.

```
{
    "AwsAccountId": "AWSACCOUNTID",
    "DataSourceId": "UNIQUEDATASOURCEID",
    "Name": "NAME",
    "Type": "SNOWFLAKE",
    "DataSourceParameters": {
        "SnowflakeParameters": {
            "Host": "HOSTNAME",
            "Database": "DATABASENAME",
            "Warehouse": "WAREHOUSENAME",
            "AuthenticationType": "TOKEN",
            "DatabaseAccessControlRole": "snowflake-db-access-role-name",
            "OAuthParameters": {
              "TokenProviderUrl": "oauth-access-token-endpoint", 
              "OAuthScope": "oauth-scope",
              "IdentityProviderResourceUri" : "resource-uri",
              "IdentityProviderVpcConnectionProperties" : {
                "VpcConnectionArn": "IdP-VPC-connection-ARN" 
              }
            }
        }
    },
    "VpcConnectionProperties": {
        "VpcConnectionArn": "VPC-connection-ARN-for-Snowflake"
    },
    "Credentials": {
        "SecretArn": "oauth-client-secret-ARN"
    }
}
```
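
As a usage sketch, you could make the equivalent call with the AWS SDK for Python (Boto3), assuming an SDK version that includes the Snowflake OAuth parameters. All identifiers below are placeholders.

```
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

# Sketch: create a Snowflake data source that authenticates with
# token-based OAuth client credentials. All values are placeholders.
quicksight.create_data_source(
    AwsAccountId="111122223333",
    DataSourceId="UNIQUEDATASOURCEID",
    Name="NAME",
    Type="SNOWFLAKE",
    DataSourceParameters={
        "SnowflakeParameters": {
            "Host": "HOSTNAME",
            "Database": "DATABASENAME",
            "Warehouse": "WAREHOUSENAME",
            "AuthenticationType": "TOKEN",
            "DatabaseAccessControlRole": "snowflake-db-access-role-name",
            "OAuthParameters": {
                "TokenProviderUrl": "oauth-access-token-endpoint",
                "OAuthScope": "oauth-scope",
                "IdentityProviderResourceUri": "resource-uri",
            },
        }
    },
    Credentials={"SecretArn": "oauth-client-secret-ARN"},
)
```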

For more information about the `CreateDataSource` API operation, see [CreateDataSource](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_CreateDataSource.html).

Once the connection between Quick Sight and Snowflake is established and a data source is created with the Quick Sight APIs or SDK, the new data source is displayed in Quick Sight. Quick Sight authors can use this data source to create datasets that contain Snowflake data. Tables are displayed based on the role used in the `DatabaseAccessControlRole` parameter that is passed in a `CreateDataSource` API call. If this parameter is not defined when the data source connection is created, the default Snowflake role is used.

After you have successfully created a data source connection between your Quick Sight and Snowflake accounts, you can begin [Creating datasets](creating-data-sets.md) that contain Snowflake data.

# Using Starburst with Amazon Quick Sight
Starburst

Starburst is a full-featured data lake analytics service built on top of Trino, a massively parallel processing (MPP) query engine. Use this section to learn how to connect from Amazon Quick Sight to Starburst. All traffic between Quick Sight and Starburst is encrypted with SSL. If you're connecting to Starburst Galaxy, you can get the necessary connection details by logging in to your Starburst Galaxy account and choosing **Partner Connect**, and then **Quick Sight**. There you can see information such as the hostname and port. Amazon Quick Sight supports basic username and password authentication to Starburst.

Quick Sight offers two ways to connect to Starburst: with your Starburst login credentials or with OAuth client credentials. Use the following sections to learn about both methods of connection.

**Topics**
+ [Creating a Quick Sight data source connection to Starburst with login credentials](#create-connection-to-starburst)
+ [Creating a Quick Sight data source connection to Starburst with OAuth client credentials](#create-connection-to-starburst-oauth)

## Creating a Quick Sight data source connection to Starburst with login credentials
Connecting with login credentials

1. Begin by creating a new dataset. From the left navigation pane, choose **Data**, then choose **Create**, then choose **New Dataset**.

1. Choose the **Starburst** data source card.

1. Select the Starburst product type. Choose **Starburst Enterprise** for on-premises Starburst instances. Choose **Starburst Galaxy** for managed instances.

1. For **Data source name**, enter a descriptive name for your Starburst data source connection. Because you can create many datasets from a connection to Starburst, it's best to keep the name simple.

1. For **Connection type**, select the type of network you're using. Choose **Public network** if your data is shared publicly. Choose **VPC** if your data is inside a VPC. To configure a VPC connection in Amazon Quick Sight, see [Configuring the VPC connection in Amazon Quick Sight](https://docs.aws.amazon.com/quicksight/latest/user/vpc-creating-a-connection-in-quicksight.html). This connection type is not available for Starburst Galaxy.

1. For **Database server**, enter the hostname specified in your Starburst connection details.

1. For **Catalog**, enter the catalog specified in your Starburst connection details.

1. For **Port**, enter the port specified in your Starburst connection details. The port defaults to 443 for Starburst Galaxy.

1. For **Username** and **Password**, enter your Starburst connection credentials.

1. To verify the connection is working, choose **Validate connection**.

1. To finish and create the data source, choose **Create data source**.

**Note**  
Connectivity between Amazon Quick Sight and Starburst was validated using Starburst version 420.

After you have successfully created a data source connection between your Quick Sight and Starburst accounts, you can begin [Creating datasets](creating-data-sets.md) that contain Starburst data.

## Creating a Quick Sight data source connection to Starburst with OAuth client credentials
Connecting with OAuth client credentials

You can use OAuth client credentials to connect your Quick Sight account with Starburst through the [Quick Sight APIs](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_CreateDataSource.html). *OAuth* is a standard authorization protocol that is often utilized for applications that have advanced security requirements. When you connect to Starburst with OAuth client credentials, you can create datasets that contain Starburst data with the Quick Sight APIs and in the Quick Sight UI. For more information about configuring OAuth in Starburst, see [OAuth 2.0 authentication](https://docs.starburst.io/latest/security/oauth2.html).

Quick Sight supports the `client credentials` OAuth grant type. The client credentials grant is used to obtain an access token for machine-to-machine communication. This method is suitable for scenarios where a client needs to access resources that are hosted on a server without the involvement of a user.

In the client credentials flow of OAuth 2.0, there are several client authentication mechanisms that can be used to authenticate the client application with the authorization server. Quick Sight supports client credentials based OAuth for Starburst for the following two mechanisms:
+ **Token (Client secrets-based OAuth)**: The secret-based client authentication mechanism is used with the client credentials grant flow to authenticate with the authorization server. This authentication scheme requires the `client_id` and `client_secret` of the OAuth client app to be stored in Secrets Manager.
+ **X509 (Client private key JWT-based OAuth)**: The X509 certificate key-based solution provides an additional security layer to the OAuth mechanism with client certificates that are used to authenticate instead of client secrets. This method is primarily used by private clients who use this method to authenticate with the authorization server with strong trust between the two services.

Quick Sight has validated OAuth connections with the following identity providers:
+ OKTA
+ PingFederate

### Storing OAuth credentials in Secrets Manager
Storing OAuth credentials

OAuth client credentials are meant for machine-to-machine use cases and are not designed to be interactive. To create a data source connection between Quick Sight and Starburst, create a new secret in Secrets Manager that contains your credentials for the OAuth client app. The secret ARN that is created with the new secret can be used to create datasets that contain Starburst data in Quick Sight. For more information about using Secrets Manager keys in Quick Sight, see [Using AWS Secrets Manager secrets instead of database credentials in Quick](secrets-manager-integration.md).

The credentials that you need to store in Secrets Manager are determined by the OAuth mechanism that you use. The following key/value pairs are required for X509-based OAuth secrets:
+ `username`: The Starburst account username to be used when connecting to Starburst
+ `client_id`: The OAuth client ID
+ `client_private_key`: The OAuth client private key
+ `client_public_key`: The OAuth client certificate public key and its encryption algorithm (for example, `{"alg": "RS256", "kid": "cert_kid"}`)

The following key/value pairs are required for token-based OAuth secrets:
+ `username`: The Starburst account username to be used when connecting to Starburst
+ `client_id`: The OAuth client ID
+ `client_secret`: The OAuth client secret

### Creating a Starburst OAuth connection with the Quick Sight APIs
Example

After you create a secret in Secrets Manager that contains your Starburst OAuth credentials and have connected your Quick account to Secrets Manager, you can establish a data source connection between Quick Sight and Starburst with the Quick Sight APIs and SDK. The following example creates a Starburst data source connection using token OAuth client credentials.

```
{
    "AwsAccountId": "AWSACCOUNTID",
    "DataSourceId": "DATASOURCEID",
    "Name": "NAME",
    "Type": "STARBURST",
    "DataSourceParameters": {
        "StarburstParameters": {
            "Host": "STARBURST_HOST_NAME",
            "Port": "STARBURST_PORT",
            "Catalog": "STARBURST_CATALOG",
            "ProductType": "STARBURST_PRODUCT_TYPE",     
            "AuthenticationType": "TOKEN",
            "DatabaseAccessControlRole": "starburst-db-access-role-name",
            "OAuthParameters": {
              "TokenProviderUrl": "oauth-access-token-endpoint", 
              "OAuthScope": "oauth-scope",
              "IdentityProviderResourceUri" : "resource-uri",
              "IdentityProviderVpcConnectionProperties" : {
                "VpcConnectionArn": "IdP-VPC-connection-ARN"
              }
            }
        }
    },
    "VpcConnectionProperties": {
        "VpcConnectionArn": "VPC-connection-ARN-for-Starburst"
    },
    "Credentials": {
        "SecretArn": "oauth-client-secret-ARN"
    }
}
```

For more information about the `CreateDataSource` API operation, see [CreateDataSource](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_CreateDataSource.html).

Once the connection between Quick Sight and Starburst is established and a data source is created with the Quick Sight APIs or SDK, the new data source is displayed in Quick Sight. Quick Sight authors can use this data source to create datasets that contain Starburst data. Tables are displayed based on the role used in the `DatabaseAccessControlRole` parameter that is passed in a `CreateDataSource` API call. If this parameter is not defined when the data source connection is created, the default Starburst role is used.
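
To confirm that the new data source is ready for authors, you could call the `DescribeDataSource` operation. The following is a minimal Boto3 sketch with placeholder IDs.

```
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

# Sketch: check the status of the Starburst data source created earlier.
# The account ID and data source ID are placeholders.
response = quicksight.describe_data_source(
    AwsAccountId="111122223333",
    DataSourceId="DATASOURCEID",
)

# CREATION_SUCCESSFUL means authors can start building datasets from it.
print(response["DataSource"]["Status"])
```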

After you have successfully created a data source connection between your Quick Sight and Starburst accounts, you can begin [Creating datasets](creating-data-sets.md) that contain Starburst data.

# Creating a data source and dataset from SaaS sources
SaaS sources

To analyze and report on data from software as a service (SaaS) applications, you can use SaaS connectors to access your data directly from Quick Sight. The SaaS connectors simplify accessing third-party application sources using OAuth, without any need to export the data to an intermediate data store.

You can use either a cloud-based or server-based instance of a SaaS application. To connect to a SaaS application that is running on your corporate network, make sure that Quick Sight can access the application's Domain Name System (DNS) name over the network. If Quick Sight can't access the SaaS application, it generates an unknown host error. 

Here are examples of some ways that you can use SaaS data:
+ Engineering teams who use Jira to track issues and bugs can report on developer efficiency and bug burndown. 
+ Marketing organizations can integrate Quick Sight with Adobe Analytics to build consolidated dashboards to visualize their online and web marketing data.

Use the following procedure to create a data source and dataset by connecting to sources available through Software as a Service (SaaS). In this procedure, we use a connection to GitHub as an example. Other SaaS data sources follow the same process, although the screens—especially the SaaS screens—might look different.

**To create a data source and dataset by connecting to sources through SaaS**

1. On the Quick start page, choose **Data**.

1. On the **Data** page, choose **Create**, and then choose **New dataset**.

1. Choose the icon that represents the SaaS source that you want to use. For example, you might choose Adobe Analytics or GitHub.

   For sources using OAuth, the connector takes you to the SaaS site to authorize the connection before you can create the data source. 

1. Enter a name for the data source. If there are more screen prompts, enter the appropriate information. Then choose **Create data source**.

1. If you are prompted to do so, enter your credentials on the SaaS login page.

1. When prompted, authorize the connection between your SaaS data source and Quick Sight.

   For example, you might authorize Quick Sight to access the GitHub account for the Quick Sight documentation.
**Note**  
Quick Sight documentation is now available on GitHub. If you want to make changes to this user guide, you can use GitHub to edit it directly.

   (Optional) If your SaaS account is part of an organizational account, you might be asked to request organization access as part of authorizing Quick Sight. If you want to do this, follow the prompts on your SaaS screen, then choose to authorize Quick Sight.

1. After authorization is complete, choose a table or object to connect to. Then choose **Select**.

1. On the **Finish data set creation** screen, choose one of these options:
   + To save the data source and dataset, choose **Edit/Preview data**. Then choose **Save** from the top menu bar.
   + To create a dataset and an analysis using the data as-is, choose **Visualize**. This option automatically saves the data source and the dataset.

     You can also choose **Edit/Preview data** to prepare the data before creating an analysis. This opens the data preparation screen. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).

The following constraints apply:
+ The SaaS source must support REST API operations for Quick Sight to connect to it.
+ If you are connecting to Jira, the URL must be a public address.
+ If you don't have enough [SPICE](spice.md) capacity, choose **Edit/Preview data**. In the data preparation screen, you can remove fields from the dataset to decrease its size or apply a filter that reduces the number of rows returned. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).

# Creating a dataset from Salesforce
Salesforce

Use the following procedure to create a dataset by connecting to Salesforce and selecting a report or object to provide data.

**To create a dataset using Salesforce from a report or object**

1. Check [Data source quotas](data-source-limits.md) to make sure that your target report or object doesn't exceed data source quotas.

1. On the Quick start page, choose **Data**.

1. On the **Data** page, choose **Create** then **New dataset**.

1. Choose the **Salesforce** icon.

1. Enter a name for the data source and then choose **Create data source**.

1. On the Salesforce login page, enter your Salesforce credentials.

1. For **Data elements: contain your data**, choose **Select** and then choose either **REPORT** or **OBJECT**.
**Note**  
Joined reports aren't supported as Quick Sight data sources.

1. Choose one of the following options:
   + To prepare the data before creating an analysis, choose **Edit/Preview data** to open data preparation. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).
   + Otherwise, choose a report or object and then choose **Select**.

1. Choose one of the following options:
   + To create a dataset and an analysis using the data as-is, choose **Visualize**.
**Note**  
If you don't have enough [SPICE](spice.md) capacity, choose **Edit/Preview data**. In data preparation, you can remove fields from the dataset to decrease its size or apply a filter that reduces the number of rows returned. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).
   + To prepare the data before creating an analysis, choose **Edit/Preview data** to open data preparation for the selected report or object. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).

**Note**  
The Salesforce connector is not supported in embedded console deployments where users authenticate through namespace isolation. The OAuth authentication flow requires direct Amazon Quick Sight console access to complete the sign-in process.

# Using Trino with Amazon Quick Sight
Trino

Trino is a massively parallel processing (MPP) query engine built to quickly query data lakes with petabytes of data. Use this section to learn how to connect from Amazon Quick Sight to Trino. All traffic between Amazon Quick Sight and Trino is encrypted with SSL. Amazon Quick Sight supports basic username and password authentication to Trino.

## Creating a data source connection for Trino
Trino

1. Begin by creating a new dataset. From the left navigation pane, choose **Data**. Choose **Create** then **New Dataset**.

1. Choose the **Trino** data source card.

1. For **Data source name**, enter a descriptive name for your Trino data source connection. Because you can create many datasets from a connection to Trino, it's best to keep the name simple.

1. For **Connection type**, select the type of network you're using. Choose **Public network** if your data is shared publicly. Choose **VPC** if your data is inside a VPC. To configure a VPC connection in Amazon Quick Sight, see [Configuring the VPC connection in Amazon Quick Sight](https://docs.aws.amazon.com/quicksight/latest/user/vpc-creating-a-connection-in-quicksight.html).

1. For **Database server**, enter the hostname specified in your Trino connection details.

1. For **Catalog**, enter the catalog specified in your Trino connection details.

1. For **Port**, enter the port specified in your Trino connection details.

1. For **Username** and **Password**, enter your Trino connection credentials.

1. To verify the connection is working, choose **Validate connection**.

1. To finish and create the data source, choose **Create data source**.

## Adding a new Amazon Quick Sight dataset for Trino


After you go through the [data source creation process](https://docs.aws.amazon.com/create-connection-to-starburst.html) for Trino, you can create Trino datasets to use for analysis. You can create new datasets from a new or an existing Trino data source. When you create a new data source, Amazon Quick Sight immediately takes you to creating a dataset, which is step 3 below. If you're using an existing data source to create a new dataset, start from step 1 below.

To create a dataset using a Trino data source, see the following steps.

1. From the start page, choose **Data**. Choose **Create** then **New dataset**.

1. Choose the Trino data source you created.

1. Choose **Create data set**.

1. To specify the table that you want to connect to, first select the **Schema** that you want to use. For **Tables**, choose the table that you want to use. If you prefer to use your own SQL statement, select **Use custom SQL**.

1. Choose **Edit/Preview**.

1. (Optional) To add more data, use the following steps:

   1. Choose **Add data** at top right.

   1. To connect to different data, choose **Switch data source**, and choose a different dataset.

   1. Follow the prompts to finish adding data.

   1. After adding new data to the same dataset, choose **Configure this join** (the two red dots). Set up a join for each additional table.

   1. If you want to add calculated fields, choose **Add calculated field**.

   1. Clear the check box for any fields that you want to omit.

   1. Update any data types that you want to change.

1. When you are done, choose **Save** to save and close the dataset.

**Note**  
Connectivity between Quick Sight and Trino was validated using Trino version 410.

# Creating a dataset using a local text file
Text files

To create a dataset using a local text file data source, identify the location of the file, and then upload it. The file data is automatically imported into [SPICE](spice.md) as part of creating a dataset. 

**To create a dataset based on a local text file**

1. Check [Data source quotas](data-source-limits.md) to make sure that your target file doesn't exceed data source quotas.

   Supported file types include .csv, .tsv, .json, .clf, and .elf files.

1. On the Quick start page, choose **Data**.

1. Choose **Create**, and then choose **New dataset**.

1. Choose **Upload a file**.

1. In the **Open** dialog box, browse to a file, select it, and then choose **Open**.

   A file must be 1 GB or less to be uploaded to Quick Sight.

1. To prepare the data before creating the dataset, choose **Edit/Preview data**. Otherwise, choose **Visualize** to create an analysis using the data as-is. 

   If you choose the former, you can specify a dataset name as part of preparing the data. If you choose the latter, a dataset with the same name as the source file is created. To learn more about data preparation, see [Preparing data in Amazon Quick Sight](preparing-data.md).

# Using Amazon Timestream data with Amazon Quick Sight
Timestream data

Following, you can find how to connect to your Amazon Timestream data using Amazon Quick Sight. For a brief overview, see the [Getting started with Amazon Timestream and Amazon QuickSight](https://youtu.be/TzW4HWl-L8s) video tutorial on YouTube. 

## Creating a new Amazon Quick Sight data source connection for a Timestream database
Creating a data source connection for Timestream

Following, you can find how to connect to Amazon Timestream from Amazon Quick Sight.

Before you can proceed, Amazon Quick Sight needs to be authorized to connect to Amazon Timestream. If connections aren't enabled, you get an error when you try to connect. A Quick Sight administrator can authorize connections to AWS resources. To authorize a connection, open the menu by choosing your profile icon at top right. Choose **Manage QuickSight**, **Security & permissions**, **Add or remove**. Then select the check box for Amazon Timestream, and choose **Update** to confirm. For more information, see [Configuring Amazon Quick Sight access to AWS data sources](access-to-aws-resources.md).

**To connect to Amazon Timestream**

1. Begin by creating a new dataset. Choose **Data** from the navigation pane at left. 

1. Choose **Create** then **New Dataset**.

1. Choose the Timestream data source card.

1. For **Data source name**, enter a descriptive name for your Timestream data source connection, for example `US Timestream Data`. Because you can create many datasets from a connection to Timestream, it's best to keep the name simple.

1. Choose **Validate connection** to check that you can successfully connect to Timestream.

1. Choose **Create data source** to proceed.

1. For **Database**, choose **Select** to view the list of available options. 

1. Choose the one you want to use, then choose **Select** to continue. 

1. Do one of the following:
   + To import your data into Quick Sight's in-memory engine (called SPICE), choose **Import to SPICE for quicker analytics**. 
   + To allow Quick Sight to run a query against your data each time you refresh the dataset or use the analysis or dashboard, choose **Directly query your data**. 

   If you want to enable autorefresh on a published dashboard that uses Timestream data, the Timestream dataset needs to use a direct query.

1. Choose **Edit/Preview** and then **Save** to save your dataset and close it.

1. Repeat these steps for the number of concurrent direct connections to Timestream that you want to open in a dataset. For example, let's say you want to use four tables in a Quick Sight dataset. Currently, Quick Sight datasets connect to only one table at a time from a Timestream data source. To use four tables in the same dataset, you need to add four data source connections in Quick Sight. 

## Managing permissions for Timestream data
Managing Timestream permissions

The following procedure describes how to view, add, and revoke permissions to allow access to the same Timestream data source. The people that you add must already be active users in Quick Sight. 

**To edit permissions on a dataset**

1. Choose **Data** at left, then scroll down to find the dataset for your Timestream connection. An example might be `US Timestream Data`.

1. Choose the **Timestream** dataset to open it.

1. On the dataset details page that opens, choose the **Permissions** tab.

   A list of current permissions appears.

1. To add permissions, choose **Add users & groups**, then follow these steps:

   1. Add users or groups to allow them to use the same dataset.

   1. When you're finished adding everyone that you want to add, choose the **Permissions** that you want to apply to them.

1. (Optional) To edit permissions, you can choose **Viewer** or **Owner**. 
   + Choose **Viewer** to allow read access.
   + Choose **Owner** to allow that user to edit, share, or delete this Quick Sight data source. 

1. (Optional) To revoke permissions, choose **Revoke access**. After you revoke someone's access, they can't create, edit, share, or delete the dataset.

1. When you are finished, choose **Close**.

## Adding a new Quick Sight dataset for Timestream
Adding a new Quick Sight dataset for Amazon Timestream

After you have an existing data source connection for Timestream data, you can create Timestream datasets to use for analysis. 

Currently, you can use a Timestream connection only for a single table in a dataset. To add data from multiple Timestream tables in a single dataset, create an additional Quick Sight data source connection for each table.

**To create a dataset using Amazon Timestream**

1. Choose **Data** at left, then scroll down to find the data source card for your Timestream connection. If you have many data sources, you can use the search bar at the top of the page to find your data source with a partial match on the name.

1. Choose the **Timestream** data source card, and then choose **Create data set**.

1. For **Database**, choose **Select** to view a list of available databases and choose the one that you want to use.

1. For **Tables**, choose the table that you want to use.

1. Choose **Edit/Preview**.

1. (Optional) To add more data, use the following steps: 

   1. Choose **Add data** at top right.

   1. To connect to different data, choose **Switch data source**, and choose a different dataset. 

   1. Follow the UI prompts to finish adding data. 

   1. After adding new data to the same dataset, choose **Configure this join** (the two red dots). Set up a join for each additional table. 

   1. If you want to add calculated fields, choose **Add calculated field**. 

   1. To add a model from SageMaker AI, choose **Augment with SageMaker**. This option is only available in Amazon Quick Enterprise edition.

   1. Clear the check box for any fields that you want to omit.

   1. Update any data types that you want to change.

1. When you are done, choose **Save** to save and close the dataset. 

## Adding Timestream data to an analysis


Following, you can find how to add an Amazon Timestream dataset to a Quick Sight analysis. Before you begin, make sure that you have an existing dataset that contains the Timestream data that you want to use.

**To add Amazon Timestream data to an analysis**

1. Choose **Analyses** at left.

1. Do one of the following:
   + To create a new analysis, choose **New analysis** at right. 
   + To add to an existing analysis, open the analysis that you want to edit. 
     + Choose the pencil icon at top left.
     + Choose **Add data set**.

1. Choose the Timestream dataset that you want to add.

For more information, see [Working with analyses](https://docs.aws.amazon.com/quicksight/latest/user/working-with-analyses.html).

# Data source quotas


Data sources that you use with Amazon Quick Sight must conform to the following quotas.

**Topics**
+ [SPICE quotas for imported data](#spice-limits)
+ [Quotas for direct SQL queries](#query-limits)

## SPICE quotas for imported data


When you create a new dataset in Amazon Quick Sight, [SPICE](spice.md) limits the number of rows that you can add to a dataset. You can ingest data into SPICE from a query or from a file. Each file can have up to 2,000 columns. Each column name can have up to 127 Unicode characters. Each field can have up to 2,047 Unicode characters. If you use the new data preparation experience to create your SPICE dataset, each field can have up to 65,534 Unicode characters. 

To retrieve a subset of data from a larger set, you can deselect columns or apply filters to reduce the size of the data. If you are importing from Amazon S3, each manifest can specify up to 1,000 files. 

Quotas for SPICE are as follows:
+ 2,047 Unicode characters for each field (65,534 Unicode characters with the new data preparation experience)
+ 127 Unicode characters for each column name
+ 2,000 columns for each file
+ 1,000 files for each manifest
+ For Standard edition, 25 million (25,000,000) rows or 25 GB for each dataset
+ For Enterprise edition, 2 billion (2,000,000,000) rows or 2 TB for each dataset

All quotas apply to SPICE datasets with row-level security, as well.

In rare cases, if you're ingesting large rows into SPICE, you might reach the quota for gigabytes per dataset before you reach the quota on rows. The size is based on the SPICE capacity the data occupies after ingestion into SPICE. 

## Quotas for direct SQL queries


If you aren't importing data into SPICE, different quotas apply for space and time. For operations such as connecting, sampling data for a dataset, and generating visuals, timeouts can occur. In some cases, these are timeout quotas set by the source database engine. In other cases, such as visualizing, Amazon Quick Sight generates a timeout after 2 minutes.

However, not all database drivers react to the 2-minute timeout, for example Amazon Redshift. In these cases, the query runs for as long as it takes for the response to return, which can result in long-running queries on your database. When this happens, you can cancel the query from the database server to free up database resources. Follow the instructions for your database server about how to do this. For example, for more information on how to cancel queries in Amazon Redshift, see [Canceling a query in Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/dg/cancel_query.html), and [Implementing workload management in Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/dg/cm-c-implementing-workload-management.html) in the *Amazon Redshift Database Developer Guide*.

Each result set from a direct query can have up to 2,000 columns. Each column name can have up to 127 Unicode characters. If you want to retrieve data from a larger table, you can use one of several methods to reduce the size of the data. You can deselect columns or apply filters. In a SQL query, you can also use predicates such as `WHERE` and `HAVING`. If your visuals time out during a direct query, you can simplify your query to optimize execution time, or you can import the data into SPICE. 

Quotas for queries are as follows:
+ 127 Unicode characters for each column name.
+ 2,000 columns for each dataset.
+ 2-minute quota for generating a visual, or an optional dataset sample.
+ Data source timeout quotas apply (varies for each database engine).

# Supported data types and values


Amazon Quick Sight currently supports the following primitive data types: `Date`, `Decimal`, `Integer`, and `String`. The following data types are supported in SPICE: `Date`, `Decimal-fixed`, `Decimal-float`, `Integer`, and `String`. Quick Sight accepts Boolean values by promoting them to integers. It can also derive geospatial data types. Geospatial data types use metadata to interpret the physical data type. Latitude and longitude are numeric. All other geospatial categories are strings. 

Make sure that any table or file that you use as a data source contains only fields that can be implicitly converted to these data types. Amazon Quick Sight skips any fields or columns that can't be converted. If you get an error that says "fields were skipped because they use unsupported data types", alter your query or table to remove or recast unsupported data types.

## String and text data


Fields or columns that contain characters are called *strings*. A field with the data type of `STRING` can initially contain almost any type of data. Examples include names, descriptions, phone numbers, account numbers, JSON data, cities, post codes, dates, and numbers that can be used to calculate. These types are sometimes called textual data in a general sense, but not in a technical sense. Quick Sight doesn't support binary and character large objects (BLOBs) in dataset columns. In the Quick Sight documentation, the term "text" always means "string data". 

The first time you query or import the data, Quick Sight tries to interpret the data that it identifies as other types, for example dates and numbers. It's a good idea to verify that the data types assigned to your fields or columns are correct. 

For each string field in imported data, Quick Sight uses a field length of 8 bytes plus the UTF-8 encoded character length. Amazon Quick Sight supports UTF-8 file encoding, but not UTF-8 (with BOM).
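
As a quick illustration of that sizing rule, the following Python sketch estimates the SPICE footprint of a single string value. The helper function is ours, for illustration only; it is not a Quick Sight API.

```
def spice_string_field_bytes(value):
    # 8 bytes of overhead plus the UTF-8 encoded length of the value.
    return 8 + len(value.encode("utf-8"))

print(spice_string_field_bytes("Seattle"))  # 8 + 7 = 15 bytes
print(spice_string_field_bytes("Zürich"))   # 8 + 7 = 15 bytes (ü takes 2 bytes)
```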

## Date and time data


Fields with a data type of `Date` also include time data, and are also known as `Datetime` fields. Quick Sight supports dates and times that use [supported date formats](#supported-date-formats). 

Quick Sight uses UTC time for querying, filtering, and displaying date data. When date data doesn't specify a time zone, Quick Sight assumes UTC values. When date data does specify a time zone, Quick Sight converts it to display in UTC time. For example, a date field with a time zone offset like **2015-11-01T03:00:00-08:00** is converted to UTC and displayed in Amazon Quick Sight as **2015-11-01T11:00:00**. 

For each `DATE` field in imported data, Quick Sight uses a field length of 8 bytes. Quick Sight supports UTF-8 file encoding, but not UTF-8 (with BOM).

## Numeric data


Numeric data includes integers and decimals. Integers with a data type of `INT` are negative or positive numbers that don't have a decimal place. Quick Sight doesn't distinguish between large and small integers. Integers over a value of `9007199254740991` or `2^53 - 1` might not display exactly or correctly in a visual.
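
That boundary is the largest integer that a double-precision float can represent exactly, which the following Python snippet illustrates. It models the display limitation only, not Quick Sight internals.

```
# 2^53 - 1 is the largest integer that a double-precision float
# represents exactly.
print(2**53 - 1)                          # 9007199254740991
print(float(2**53 + 1) == float(2**53))   # True: 2^53 + 1 rounds to 2^53
```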

Decimals with the data type of `Decimal` are negative or positive numbers that contain at least one decimal place before or after the decimal point. When you choose Direct Query mode, all non-integer decimal types are marked as `Decimal` and the underlying engine handles the precision of the datapoint based on the data source's supported behaviors. For more information on supported data source types, see [Supported data types and values](#supported-data-types-and-values).

When you store your dataset in SPICE, you can choose to store your decimal values as `fixed` or `float` decimal types. `Decimal-fixed` data types use the decimal(`18,4`) format, which allows 18 digits in total and up to 4 digits after the decimal point. `Decimal-fixed` data types are a good choice for exact mathematical operations, but Quick Sight rounds the value to the nearest ten-thousandth place when the value is ingested into SPICE.

`Decimal-float` data types provide approximately 16 significant digits of accuracy to a value. The significant digits can be on either side of the decimal point to support numbers with many decimal places and higher numbers at the same time. For example, the `Decimal-float` data type supports the number `12345.1234567890` or the number `1234567890.12345`. If you work with very small numbers that are close to `0`, the `Decimal-float` data type supports up to 15 digits to the right of the decimal point, for example `0.123451234512345`. The maximum value that this data type supports is `1.8 * 10^308` to minimize the probability of an overflow error with your dataset.
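
The following Python sketch models the two behaviors described above with the standard decimal module. It is an illustration under our own assumptions (including the choice of rounding mode), not SPICE's internal implementation.

```
from decimal import ROUND_HALF_UP, Decimal

# Model Decimal-fixed, decimal(18,4): round to the nearest ten-thousandth.
# The rounding mode here is an assumption for illustration.
value = Decimal("12345.123456789")
print(value.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP))  # 12345.1235

# A double-precision float keeps roughly 16 significant digits, similar to
# the Decimal-float behavior described above.
print(12345.1234567890)   # 12345.123456789
print(1234567890.12345)   # 1234567890.12345
```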

The `Decimal-float` data type is inexact, and some values are stored as approximations instead of the real value. This can result in slight discrepancies when you store and return some specific values. The following considerations apply to the `Decimal-float` data type.
+ If the dataset that you're using comes from an Amazon S3 data source, SPICE assigns the `Decimal-float` decimal type to all numeric decimal values.
+ If the dataset that you're using comes from a database, SPICE uses the decimal type that the value is assigned in the database. For example, if the value is assigned a fixed-point numeric value in the database, the value will be a `Decimal-fixed` type in SPICE.

For existing SPICE datasets that contain fields that can be converted to the `Decimal-float` data type, a pop-up appears on the **Edit dataset** page. To convert fields of an existing dataset to the `Decimal-float` data type, choose **UPDATE FIELDS**. If you don't want to opt in, choose **DO NOT UPDATE FIELDS**. The **Update fields** pop-up appears every time you open the **Edit dataset** page until the dataset is saved and published.

## Supported data types from external data sources


The following table lists data types that are supported when using the following data sources with Amazon Quick Sight. 


****  

The full lists of supported numeric, string, datetime, and Boolean data types for each database engine or source are available in [Supported data types and values](http://docs.aws.amazon.com/quick/latest/userguide/supported-data-types-and-values.html) on the AWS documentation website. The engines covered are the following:
+ **Amazon Athena, Presto, Starburst, Trino**
+ **Amazon Aurora**, **MariaDB**, and **MySQL** (no supported Boolean types)
+ **Amazon OpenSearch Service**
+ **Oracle** (Boolean values use the `bit` type)
+ **PostgreSQL**
+ **Apache Spark**
+ **Snowflake**
+ **Microsoft SQL Server**

### Supported date formats


Amazon Quick Sight supports the date and time formats described in this section. Before you add data to Amazon Quick Sight, check if your date format is compatible. If you need to use an unsupported format, see [Using unsupported or custom dates](using-unsupported-dates.md).

The supported formats vary depending on the data source type, as follows:


| Data source | Clocks | Date formats | 
| --- | --- | --- | 
|  File uploads, Amazon S3 sources, Athena, Salesforce  |  Both 24-hour and 12-hour clocks  |  Supported date and time formats are described in the Joda API documentation. For a complete list of Joda date formats, see [Class DateTimeFormat](http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html) on the Joda website. For datasets stored in memory (SPICE), Amazon Quick Sight supports dates in the following range: `Jan 1, 0001 00:00:00 UTC` through `Dec 31, 9999 23:59:59 UTC`.  | 
|  Relational database sources  |  24-hour clock only  |  The date and time formats listed in [Supported data types and values](http://docs.aws.amazon.com/quick/latest/userguide/supported-data-types-and-values.html).  | 

### Unsupported values in data


If a field contains values that don't conform with the data type that Amazon Quick Sight assigns to the field, the rows containing those values are skipped. For example, take the following source data.

```
Sales ID    Sales Date    Sales Amount
--------------------------------------
001        10/14/2015        12.43
002        5/3/2012          25.00
003        Unknown           18.17
004        3/8/2009          86.02
```

Amazon Quick Sight interprets **Sales Date** as a date field and drops the row containing a nondate value, so only the following rows are imported.

```
Sales ID    Sales Date    Sales Amount
--------------------------------------
001        10/14/2015        12.43
002        5/3/2012          25.00
004        3/8/2009          86.02
```

In some cases, a database field might contain values that the JDBC driver can't interpret for the source database engine. In such cases, the uninterpretable values are replaced by null so that the rows can be imported. The only known occurrence of this issue is with MySQL date, datetime, and timestamp fields that have all-zero values, for example **0000-00-00 00:00:00**. For example, take the following source data.

```
Sales ID    Sales Date                Sales Amount
---------------------------------------------------
001        2004-10-12 09:14:27        12.43
002        2012-04-07 12:59:03        25.00
003        0000-00-00 00:00:00        18.17
004        2015-09-30 01:41:19        86.02
```

In this case, the following data is imported.

```
Sales ID    Sales Date                Sales Amount
---------------------------------------------------
001        2004-10-12 09:14:27        12.43
002        2012-04-07 12:59:03        25.00
003        (null)                     18.17
004        2015-09-30 01:41:19        86.02
```
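
If you want to find these problem rows before import instead of after, you can screen the source data yourself. The following is a minimal sketch in Python using pandas; the file name and column name are hypothetical, and the coercion only approximates the type checking that Amazon Quick Sight performs.

```
import pandas as pd

# Hypothetical CSV export of the source table shown above.
df = pd.read_csv("sales.csv")

# Coerce the date column the way a date field would be interpreted;
# values that can't be parsed (such as "Unknown") become NaT.
parsed = pd.to_datetime(df["Sales Date"], errors="coerce")

# These are the rows that would be skipped or nulled out on import.
bad_rows = df[parsed.isna()]
print(f"{len(bad_rows)} row(s) contain nondate values:")
print(bad_rows)
```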

# Working with datasets


Datasets are the foundation of your Quick Sight analytics, serving as the prepared and structured data sources that power your analyses and dashboards. Once you've created datasets from your data sources, you need to manage them effectively throughout their lifecycle to ensure reliable, secure, and collaborative analytics.

This section covers the complete dataset management workflow, from editing and versioning datasets to sharing them with team members and implementing security controls. You'll learn how to maintain dataset integrity while supporting collaborative analytics, track which analyses depend on your datasets, and implement both row-level and column-level security to protect sensitive information. Whether you're preparing datasets for team use, troubleshooting analysis issues, or implementing data governance policies, these topics provide the essential knowledge for effective dataset management in Quick Sight.

**Topics**
+ [Creating datasets](creating-data-sets.md)
+ [Editing datasets](edit-a-data-set.md)
+ [Reverting datasets back to previous published versions](dataset-versioning.md)
+ [Duplicating datasets](duplicate-a-data-set.md)
+ [Sharing datasets](sharing-data-sets.md)
+ [Tracking dashboards and analyses that use a dataset](track-analytics-that-use-dataset.md)
+ [Using dataset parameters in Amazon Quick](dataset-parameters.md)
+ [Using row-level security in Amazon Quick](row-level-security.md)
+ [Using column-level security to restrict access to a dataset](restrict-access-to-a-data-set-using-column-level-security.md)
+ [Running queries as an IAM role in Amazon Quick](datasource-run-as-role.md)
+ [Deleting datasets](delete-a-data-set.md)
+ [Adding a dataset to an analysis](adding-a-data-set-to-an-analysis.md)

# Creating datasets


You can create datasets from new or existing data sources in Amazon Quick. Supported database data sources include Amazon RDS instances and Amazon Redshift clusters, as well as MariaDB, Microsoft SQL Server, MySQL, Oracle, and PostgreSQL instances that run in your organization, in Amazon EC2, or in similar environments.

**Topics**
+ [Creating datasets using new data sources](creating-data-sets-new.md)
+ [Creating a dataset using an existing data source](create-a-data-set-existing.md)
+ [Creating a dataset using an existing dataset in Amazon Quick](create-a-dataset-existing-dataset.md)

# Creating datasets using new data sources
From new data sources

When you create a dataset based on an AWS service like Amazon RDS, Amazon Redshift, or Amazon EC2, data transfer charges might apply when consuming data from that source. Those charges might also vary depending on whether that AWS resource is in the home AWS Region that you chose for your Amazon Quick account. For details on pricing, see the pricing page for the service in question.

When creating a new database dataset, you can select one table, join several tables, or create a SQL query to retrieve the data that you want. You can also change whether the dataset uses a direct query or instead stores data in [SPICE](spice.md).

**To create a new dataset**

1. To create a dataset, choose **New data set** on the **Data** page. You can then create a dataset based on an existing dataset or data source, or connect to a new data source and base the dataset on that.

1. Provide connection information to the data source:
   + For local text or Microsoft Excel files, you can simply identify the file location and upload the file.
   + For Amazon S3, provide a manifest identifying the files or buckets that you want to use, and also the import settings for the target files.
   + For Amazon Athena, all Athena databases for your AWS account are returned. No additional credentials are required.
   + For Salesforce, provide credentials to connect with.
   + For Amazon Redshift, Amazon RDS, Amazon EC2, or other database data sources, provide information about the server and database that host the data. Also provide valid credentials for that database instance.
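
If you script dataset creation with the AWS SDKs instead of the console, you pass the same connection information to the `CreateDataSource` API operation. The following boto3 sketch creates a PostgreSQL data source; the account ID, identifiers, host, and credentials are placeholder values, and error handling is omitted.

```
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

# Placeholder values -- substitute your own account and connection details.
response = quicksight.create_data_source(
    AwsAccountId="111122223333",
    DataSourceId="my-postgres-source",
    Name="My PostgreSQL source",
    Type="POSTGRESQL",
    DataSourceParameters={
        "PostgreSqlParameters": {
            "Host": "database.example.com",
            "Port": 5432,
            "Database": "sales",
        }
    },
    Credentials={
        "CredentialPair": {"Username": "quicksight_reader", "Password": "..."}
    },
    SslProperties={"DisableSsl": False},  # keep SSL enabled (recommended)
)
print(response["Arn"])
```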

# Creating a dataset from a database


The following procedures walk you through connecting to database data sources and creating datasets. To create datasets from AWS data sources that your Amazon Quick account autodiscovered, use [Creating a dataset from an autodiscovered Amazon Redshift cluster or Amazon RDS instance](#create-a-data-set-autodiscovered). To create datasets from any other database data sources, use [Creating a dataset using a database that's not autodiscovered](#create-a-data-set-database). 

## Creating a dataset from an autodiscovered Amazon Redshift cluster or Amazon RDS instance


Use the following procedure to create a connection to an autodiscovered AWS data source.

**To create a connection to an autodiscovered AWS data source**

1. Check [Data source quotas](data-source-limits.md) to make sure that your target table or query doesn't exceed data source quotas.

1. Confirm that the database credentials you plan to use have appropriate permissions as described in [Required permissions](required-permissions.md). 

1. Make sure that you have configured the cluster or instance for Amazon Quick access by following the instructions in [Network and database configuration requirements](configure-access.md).

1. On the Amazon Quick start page, choose **Data**.

1. Choose **Create**, and then choose **New dataset**.

1. Choose either the **RDS** or the **Redshift Auto-discovered** icon, depending on the AWS service that you want to connect to.

1. Enter the connection information for the data source, as follows:
   + For **Data source name**, enter a name for the data source.
   + For **Instance ID**, choose the name of the instance or cluster that you want to connect to.
   + **Database name** shows the default database for the **Instance ID** cluster or instance. To use a different database on that cluster or instance, enter its name.
   + For **UserName**, enter the user name of a user account that has permissions to do the following: 
     + Access the target database. 
     + Read (perform a `SELECT` statement on) any tables in that database that you want to use.
   + For **Password**, enter the password for the account that you entered.

1. Choose **Validate connection** to verify that your connection information is correct.

1. If the connection validates, choose **Create data source**. If not, correct the connection information and try validating again.
**Note**  
Amazon Quick automatically secures connections to Amazon RDS instances and Amazon Redshift clusters by using Secure Sockets Layer (SSL). You don't need to do anything to enable this.

1. Choose one of the following:
   + **Custom SQL**

     On the next screen, you can choose to write a query with the **Use custom SQL** option. Doing this opens a screen named **Enter custom SQL query**, where you can enter a name for your query, and then enter the SQL. For best results, compose the query in a SQL editor, and then paste it into this window. After you name and enter the query, you can choose **Edit/Preview data** or **Confirm query**. Choose **Edit/Preview data** to immediately go to data preparation. Choose **Confirm query** to validate the SQL and make sure that there are no errors.
   + **Choose tables**

     To connect to specific tables, for **Schema: contain sets of tables**, choose **Select** and then choose a schema. In some cases where there is only a single schema in the database, that schema is automatically chosen, and the schema selection option isn't displayed.

     To prepare the data before creating an analysis, choose **Edit/Preview data** to open data preparation. Use this option if you want to join to more tables.

     Otherwise, after choosing a table, choose **Select**.

1. Choose one of the following options:
   + Prepare the data before creating an analysis. To do this, choose **Edit/Preview data** to open data preparation for the selected table. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).
   + Create a dataset and an analysis using the table data as-is and import the dataset data into SPICE for improved performance (recommended). To do this, check the table size and the SPICE indicator to see if you have enough capacity.

     If you have enough SPICE capacity, choose **Import to SPICE for quicker analytics**, and then create an analysis by choosing **Visualize**.
**Note**  
If you want to use SPICE and you don't have enough space, choose **Edit/Preview data**. In data preparation, you can remove fields from the dataset to decrease its size. You can also apply a filter or write a SQL query that reduces the number of rows or columns returned. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).
   + To create a dataset and an analysis using the table data as-is, and to have the data queried directly from the database, choose the **Directly query your data** option. Then create an analysis by choosing **Visualize**.
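
The SPICE-versus-direct-query choice in this last step corresponds to the `ImportMode` field of the `CreateDataSet` API operation. The following boto3 sketch creates a dataset from a single table; the account ID, ARNs, table, and columns are placeholder values.

```
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

quicksight.create_data_set(
    AwsAccountId="111122223333",                 # placeholder account ID
    DataSetId="sales-dataset",                   # hypothetical dataset ID
    Name="Sales",
    PhysicalTableMap={
        "sales-table": {
            "RelationalTable": {
                "DataSourceArn": "arn:aws:quicksight:us-east-1:111122223333:datasource/my-postgres-source",
                "Schema": "public",
                "Name": "transactions",
                "InputColumns": [
                    {"Name": "region", "Type": "STRING"},
                    {"Name": "sales", "Type": "DECIMAL"},
                ],
            }
        }
    },
    # "SPICE" imports the data; "DIRECT_QUERY" queries the database directly.
    ImportMode="SPICE",
)
```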

## Creating a dataset using a database that's not autodiscovered


Use the following procedure to create a connection to any database other than an autodiscovered Amazon Redshift cluster or Amazon RDS instance. Such databases include Amazon Redshift clusters and Amazon RDS instances that are in a different AWS Region or are associated with a different AWS account. They also include MariaDB, Microsoft SQL Server, MySQL, Oracle, and PostgreSQL instances that are on-premises, in Amazon EC2, or in some other accessible environment.

**To create a connection to a database that isn't an autodiscovered Amazon Redshift cluster or RDS instance**

1. Check [Data source quotas](data-source-limits.md) to make sure that your target table or query doesn't exceed data source quotas.

1. Confirm that the database credentials that you plan to use have appropriate permissions as described in [Required permissions](required-permissions.md). 

1. Make sure that you have configured the cluster or instance for Amazon Quick access by following the instructions in [Network and database configuration requirements](configure-access.md).

1. On the Amazon Quick start page, choose **Data**.

1. Choose **Create**, and then choose **New data set**.

1. Choose the **Redshift Manual connect** icon if you want to connect to an Amazon Redshift cluster in another AWS Region or associated with a different AWS account. Or choose the appropriate database management system icon to connect to an instance of Amazon Aurora, MariaDB, Microsoft SQL Server, MySQL, Oracle, or PostgreSQL.

1. Enter the connection information for the data source, as follows:
   + For **Data source name**, enter a name for the data source.
   + For **Database server**, enter one of the following values:
     + For an Amazon Redshift cluster or Amazon RDS instance, enter the endpoint of the cluster or instance without the port number. For example, if the endpoint value is `clustername.1234abcd.us-west-2.redshift.amazonaws.com:1234`, then enter `clustername.1234abcd.us-west-2.redshift.amazonaws.com`. You can get the endpoint value from the **Endpoint** field on the cluster or instance detail page in the AWS console.
     + For an Amazon EC2 instance of MariaDB, Microsoft SQL Server, MySQL, Oracle, or PostgreSQL, enter the public DNS address. You can get the public DNS value from the **Public DNS** field on the instance detail pane in the Amazon EC2 console.
     + For a non-Amazon EC2 instance of MariaDB, Microsoft SQL Server, MySQL, Oracle, or PostgreSQL, enter the hostname or public IP address of the database server. If you are using Secure Sockets Layer (SSL) for a secured connection (recommended), you likely need to provide the hostname to match the information required by the SSL certificate. For a list of accepted certificates, see [Amazon Quick SSL and CA certificates](configure-access.md#ca-certificates).
   + For **Port**, enter the port that the cluster or instance uses for connections.
   + For **Database name**, enter the name of the database that you want to use.
   + For **UserName**, enter the user name of a user account that has permissions to do the following: 
     + Access the target database. 
     + Read (perform a `SELECT` statement on) any tables in that database that you want to use.
   + For **Password**, enter the password associated with the account you entered.

1. (Optional) If you are connecting to anything other than an Amazon Redshift cluster and you *don't* want a secured connection, make sure that **Enable SSL** is cleared. *We strongly recommend leaving it selected*, because an unsecured connection can be open to tampering.

   For more information on how the target instance uses SSL to secure connections, see the documentation for the target database management system. Amazon Quick doesn't accept self-signed SSL certificates as valid. For a list of accepted certificates, see [Amazon Quick SSL and CA certificates](configure-access.md#ca-certificates).

   Amazon Quick automatically secures connections to Amazon Redshift clusters by using SSL. You don't need to do anything to enable this.

   Some databases, such as Presto and Apache Spark, must meet additional requirements before Amazon Quick can connect. For more information, see [Creating a data source using Presto](create-a-data-source-presto.md), or [Creating a data source using Apache Spark](create-a-data-source-spark.md).

1. (Optional) Choose **Validate connection** to verify that your connection information is correct.

1. If the connection validates, choose **Create data source**. If not, correct the connection information and try validating again.

1. Choose one of the following:
   + **Custom SQL**

     On the next screen, you can choose to write a query with the **Use custom SQL** option. Doing this opens a screen named **Enter custom SQL query**, where you can enter a name for your query, and then enter the SQL. For best results, compose the query in a SQL editor, and then paste it into this window. After you name and enter the query, you can choose **Edit/Preview data** or **Confirm query**. Choose **Edit/Preview data** to immediately go to data preparation. Choose **Confirm query** to validate the SQL and make sure that there are no errors.
   + **Choose tables**

     To connect to specific tables, for **Schema: contain sets of tables**, choose **Select** and then choose a schema. In some cases where there is only a single schema in the database, that schema is automatically chosen, and the schema selection option isn't displayed.

     To prepare the data before creating an analysis, choose **Edit/Preview data** to open data preparation. Use this option if you want to join to more tables.

     Otherwise, after choosing a table, choose **Select**.

1. Choose one of the following options:
   + Prepare the data before creating an analysis. To do this, choose **Edit/Preview data** to open data preparation for the selected table. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).
   + Create a dataset and an analysis using the table data as-is and import the dataset data into SPICE for improved performance (recommended). To do this, check the table size and the SPICE indicator to see if you have enough space.

     If you have enough SPICE capacity, choose **Import to SPICE for quicker analytics**, and then create an analysis by choosing **Visualize**.
**Note**  
If you want to use SPICE and you don't have enough space, choose **Edit/Preview data**. In data preparation, you can remove fields from the dataset to decrease its size. You can also apply a filter or write a SQL query that reduces the number of rows or columns returned. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).
   + Create a dataset and an analysis using the table data as-is and have the data queried directly from the database. To do this, choose the **Directly query your data** option. Then create an analysis by choosing **Visualize**.

# Creating a dataset using an existing data source
From existing data sources

After you make an initial connection to a Salesforce, AWS data store, or other database data source, Amazon Quick saves the connection information. It adds the data source to the **FROM EXISTING DATA SOURCES** section of the **Create a Data Set** page. You can use these existing data sources to create new datasets without respecifying connection information.

## Creating a dataset using an existing Amazon S3 data source


Use the following procedure to create a dataset using an existing Amazon S3 data source.

**To create a dataset using an existing S3 data source**

1. On the Amazon Quick start page, choose **Data**.

1. Choose **Create**, and then choose **New dataset**.

1. Choose the Amazon S3 data source to use.

1. To prepare the data before creating the dataset, choose **Edit/Preview data**. To create an analysis using the data as-is, choose **Visualize**.

## Creating a dataset using an existing Amazon Athena data source


To create a dataset using an existing Amazon Athena data source, use the following procedure.

**To create a dataset from an existing Athena connection profile**

1. On the Amazon Quick start page, choose **Data**.

1. Choose **Create**, and then choose **New data set**.

1. Choose the connection profile icon for the existing data source that you want to use. Connection profiles are labeled with the data source icon and the name provided by the person who created the connection.

1. Choose **Create data set**.

   Amazon Quick creates a connection profile for this data source based only on the Athena workgroup. The database and table aren't saved. 

1. On the **Choose your table** screen, do one of the following:
   + To write a SQL query, choose **Use custom SQL**.
   + To choose a database and table, first select your database from the **Database** list. Next, choose a table from the list that appears for your database.
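
These two choices map to the two kinds of physical tables in the `CreateDataSet` API operation: a `RelationalTable` (database and table) or a `CustomSql` query. The following boto3 sketch takes the custom SQL path against an existing Athena data source; the ARN, IDs, query, and columns are placeholder values.

```
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

quicksight.create_data_set(
    AwsAccountId="111122223333",                 # placeholder account ID
    DataSetId="flights-dataset",                 # hypothetical dataset ID
    Name="Flights",
    PhysicalTableMap={
        "flights-sql": {
            "CustomSql": {
                "DataSourceArn": "arn:aws:quicksight:us-east-1:111122223333:datasource/my-athena-source",
                "Name": "flights-query",
                "SqlQuery": "SELECT origin_state_abr, dep_delay FROM all_flights",
                "Columns": [
                    {"Name": "origin_state_abr", "Type": "STRING"},
                    {"Name": "dep_delay", "Type": "INTEGER"},
                ],
            }
        }
    },
    ImportMode="DIRECT_QUERY",
)
```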

## Creating a dataset using an existing Salesforce data source


Use the following procedure to create a dataset using an existing Salesforce data source.

**To create a dataset using an existing Salesforce data source**

1. On the Amazon Quick start page, choose **Data**.

1. Choose **Create**, and then choose **New data set**.

1. Choose the Salesforce data source to use.

1. Choose **Create Data Set**.

1. Choose one of the following:
   + **Custom SQL**

     On the next screen, you can choose to write a query with the **Use custom SQL** option. Doing this opens a screen named **Enter custom SQL query**, where you can enter a name for your query, and then enter the SQL. For best results, compose the query in a SQL editor, and then paste it into this window. After you name and enter the query, you can choose **Edit/Preview data** or **Confirm query**. Choose **Edit/Preview data** to immediately go to data preparation. Choose **Confirm query** to validate the SQL and make sure that there are no errors.
   + **Choose tables**

     To connect to specific tables, for **Data elements: contain your data**, choose **Select** and then choose either **REPORT** or **OBJECT**. 

     To prepare the data before creating an analysis, choose **Edit/Preview data** to open data preparation. Use this option if you want to join to more tables.

     Otherwise, after choosing a table, choose **Select**.

1. On the next screen, choose one of the following options:
   + To create a dataset and an analysis using the data as-is, choose **Visualize**.
**Note**  
If you don't have enough [SPICE](spice.md) capacity, choose **Edit/Preview data**. In data preparation, you can remove fields from the dataset to decrease its size or apply a filter that reduces the number of rows returned. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).
   + To prepare the data before creating an analysis, choose **Edit/Preview data** to open data preparation for the selected report or object. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).

## Creating a dataset using an existing database data source


Use the following procedure to create a dataset using an existing database data source.

**To create a dataset using an existing database data source**

1. On the Amazon Quick start page, choose **Data**.

1. Choose **Create**, and then choose **New data set**.

1. Choose the database data source to use, and then choose **Create Data Set**.

1. Choose one of the following:
   + **Custom SQL**

     On the next screen, you can choose to write a query with the **Use custom SQL** option. Doing this opens a screen named **Enter custom SQL query**, where you can enter a name for your query, and then enter the SQL. For best results, compose the query in a SQL editor, and then paste it into this window. After you name and enter the query, you can choose **Edit/Preview data** or **Confirm query**. Choose **Edit/Preview data** to immediately go to data preparation. Choose **Confirm query** to validate the SQL and make sure that there are no errors.
   + **Choose tables**

     To connect to specific tables, for **Schema: contain sets of tables**, choose **Select** and then choose a schema. In some cases where there is only a single schema in the database, that schema is automatically chosen, and the schema selection option isn't displayed.

     To prepare the data before creating an analysis, choose **Edit/Preview data** to open data preparation. Use this option if you want to join to more tables.

     Otherwise, after choosing a table, choose **Select**.

1. Choose one of the following options:
   + Prepare the data before creating an analysis. To do this, choose **Edit/Preview data** to open data preparation for the selected table. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).
   + Create a dataset and an analysis using the table data as-is and import the dataset data into [SPICE](spice.md) for improved performance (recommended). To do this, check the SPICE indicator to see if you have enough space.

     If you have enough SPICE capacity, choose **Import to SPICE for quicker analytics**, and then create an analysis by choosing **Visualize**.
**Note**  
If you want to use SPICE and you don't have enough space, choose **Edit/Preview data**. In data preparation, you can remove fields from the dataset to decrease its size. You can also apply a filter or write a SQL query that reduces the number of rows or columns returned. For more information about data preparation, see [Preparing dataset examples](preparing-data-sets.md).
   + Create a dataset and an analysis using the table data as-is and have the data queried directly from the database. To do this, choose the **Directly query your data** option. Then create an analysis by choosing **Visualize**.

# Creating a dataset using an existing dataset in Amazon Quick
From existing datasets

After you create a dataset in Amazon Quick, you can create additional datasets that use it as a source. When you do this, any data preparation that the parent dataset contains, such as joins or calculated fields, is kept. You can add further preparation to the data in the new child datasets, such as joining new data and filtering data. You can also set up your own data refresh schedule for the child dataset and track the dashboards and analyses that use it.

Child datasets that are created using a dataset with RLS rules active as a source inherit the parent dataset's RLS rules. Users who are creating a child dataset from a larger parent dataset can only see the data that they have access to in the parent dataset. Then, you can add more RLS rules to the new child dataset in addition to the inherited RLS rules to further manage who can access the data that is in the new dataset. You can only create child datasets from datasets with RLS rules active in Direct Query.

Creating datasets from existing Quick datasets has the following advantages:
+ **Central management of datasets** – Data engineers can easily scale to the needs of multiple teams within their organization. To do this, they can develop and maintain a few general-purpose datasets that describe the organization's main data models.
+ **Reduction of data source management** – Business analysts (BAs) often spend lots of time and effort requesting access to databases, managing database credentials, finding the right tables, and managing Quick data refresh schedules. Building new datasets from existing datasets means that BAs don't have to start from scratch with raw data from databases. They can start with curated data.
+ **Predefined key metrics** – By creating datasets from existing datasets, data engineers can centrally define and maintain critical data definitions across their company's many organizations. Examples might be sales growth and net marginal return. With this feature, data engineers can also distribute changes to those definitions. This approach means that their business analysts can get started with visualizing the right data more quickly and reliably.
+ **Flexibility to customize data** – By creating datasets from existing datasets, business analysts get more flexibility to customize datasets for their own business needs. They can avoid worrying about disrupting data for other teams.

For example, let's say that you're part of an ecommerce central team of five data engineers. You and your team have access to sales, orders, cancellations, and returns data in a database. You have created a Quick dataset by joining 18 other dimension tables through a schema. A key metric that your team has created is the calculated field order product sales (OPS). Its definition is: OPS = product quantity x price.

Your team serves over 100 business analysts across 10 different teams in eight countries. These include the Coupons team, the Outbound Marketing team, the Mobile Platform team, and the Recommendations team. All of these teams use the OPS metric as a base to analyze their own business line.

Rather than manually creating and maintaining hundreds of unconnected datasets, your team reuses datasets to create multiple levels of datasets for teams across the organization. Doing this centralizes data management and allows each team to customize the data for their own needs. At the same time, this syncs updates to the data, such as updates to metric definitions, and maintains row-level and column-level security. For example, individual teams in your organization can use the centralized datasets. They can then combine them with the data specific to their team to create new datasets and build analyses on top of them.

Along with using the key OPS metric, other teams in your organization can reuse column metadata from the centralized datasets that you created. For example, the Data Engineering team can define metadata, such as *name*, *description*, *data type*, and *folders*, in a centralized dataset. All subsequent teams can use it.

**Note**  
Amazon Quick supports creating up to two additional levels of datasets from a single dataset.  
For example, from a parent dataset, you can create a child dataset and then a grandchild dataset for a total of three dataset levels.

## Creating a dataset from an existing dataset


Use the following procedure to create a dataset from an existing dataset.

**To create a dataset from an existing dataset**

1. From the Quick start page, choose **Data** in the pane at left.

1. Choose **Create**, and then choose the dataset that you want to use to create a new dataset.

1. On the page that opens for that dataset, choose the drop-down menu for **Use in analysis**, and then choose **Use in dataset**.

   The data preparation page opens and preloads everything from the parent dataset, including calculated fields, joins, and security settings.

1. On the data preparation page that opens, for **Query mode** at bottom left, choose how you want the dataset to pull in changes and updates from the original, parent dataset. You can choose the following options: 
   + **Direct query** – This is the default query mode. If you choose this option, the data for this dataset automatically refreshes when you open an associated dataset, analysis, or dashboard. However, the following limitations apply:
     + If the parent dataset allows direct querying, you can use direct query mode in the child dataset.
     + If you have multiple parent datasets in a join, you can choose direct query mode for your child dataset only if all the parents are from the same underlying data source. For example, the same Amazon Redshift connection.
     + Direct query is supported for a single SPICE parent dataset. It is not supported for multiple SPICE parent datasets in a join.
   + **SPICE** – If you choose this option, you can set up a schedule for your new dataset to sync with the parent dataset. For more information about creating SPICE refresh schedules for datasets, see [Refreshing SPICE data](refreshing-imported-data.md).

1. (Optional) Prepare your data for analysis. For more information about preparing data, see [Preparing data in Amazon Quick Sight](preparing-data.md).

1. (Optional) Set up row-level or column-level security (RLS/CLS) to restrict access to the dataset. For more information about setting up RLS, see [Using row-level security with user-based rules to restrict access to a dataset](restrict-access-to-a-data-set-using-row-level-security.md). For more information about setting up CLS, see [Using column-level security to restrict access to a dataset](restrict-access-to-a-data-set-using-column-level-security.md).
**Note**  
You can set up RLS/CLS on child datasets only. RLS/CLS on parent datasets is not supported.

1. When you're finished, choose **Save & publish** to save your changes and publish the new child dataset. Or choose **Publish & visualize** to publish the new child dataset and begin visualizing your data.

# Restricting others from creating new datasets from your dataset


When you create a dataset in Amazon Quick, you can prevent others from using it as a source for other datasets. You can specify whether others can use it to create any datasets at all. Or you can specify the types of datasets that others can or can't create from your dataset, such as direct query datasets or SPICE datasets.

Use the following procedure to restrict others from creating new datasets from your dataset.

**To restrict others from creating new datasets from your dataset**

1. From the Quick start page, choose **Data** in the pane at left.

1. Choose **Create**, and then choose the dataset that you want to restrict.

1. On the page that opens for that dataset, choose **Edit dataset**.

1. On the data preparation page that opens, choose **Manage** at upper right, and then choose **Properties**.

1. In the **Dataset properties** pane that opens, choose from the following options:
   + To restrict anyone from creating any type of new datasets from this dataset, turn off **Allow new datasets to be created from this one**.

     The toggle is blue when creating new datasets is allowed. It's gray when creating new datasets isn't allowed.
   + To restrict others from creating direct query datasets, clear **Allow direct query**.
   + To restrict others from creating SPICE copies of your dataset, clear **Allow SPICE copies**.

     For more information about SPICE datasets, see [Importing data into SPICE](spice.md).

1. Close the pane.
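
These console toggles map to the dataset's usage configuration in the API. The following boto3 sketch disables both kinds of reuse with `UpdateDataSet`; the account ID and dataset ID are placeholders, and the existing name, physical table map, and import mode are read back first because `UpdateDataSet` requires them.

```
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

dataset = quicksight.describe_data_set(
    AwsAccountId="111122223333", DataSetId="sales-dataset"  # placeholders
)["DataSet"]

quicksight.update_data_set(
    AwsAccountId="111122223333",
    DataSetId="sales-dataset",
    Name=dataset["Name"],
    PhysicalTableMap=dataset["PhysicalTableMap"],
    ImportMode=dataset["ImportMode"],
    DataSetUsageConfiguration={
        "DisableUseAsDirectQuerySource": True,  # no direct query child datasets
        "DisableUseAsImportedSource": True,     # no SPICE copies
    },
)
```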

# Editing datasets


You can edit an existing dataset to perform data preparation. For more information about Quick Sight data preparation functionality, see [Preparing data in Amazon Quick Sight](preparing-data.md).

You can open a dataset for editing from the **Datasets** page, or from the analysis page. Editing a dataset from either location modifies the dataset for all analyses that use it.

## Things to consider when editing datasets


In two situations, changes to a dataset might cause concern. One is if you deliberately edit the dataset. The other is if your data source has changed so much that it affects the analyses based on it. 

**Important**  
Analyses that are in production usage should be protected so they continue to function correctly. 

We recommend the following when you're dealing with data changes:
+ Document your data sources and datasets, and the visuals that rely upon them. Documentation should include screenshots, fields used, placement in field wells, filters, sorts, calculations, colors, formatting, and so on. Record everything that you need to recreate the visual. You can also track which Quick Sight resources use a dataset in the dataset management options. For more information, see [Tracking dashboards and analyses that use a dataset](track-analytics-that-use-dataset.md).
+ When you edit a dataset, try not to make changes that might break existing visuals. For example, don't remove columns that are being used in a visual. If you must remove a column, create a calculated column in its place. The replacement column should have the same name and data type as the original. 
+ If your data source or dataset changes in your source database, adapt your visual to accommodate the change, as described previously. Or you can try to adapt the source database. For example, you might create a view of the source table (document). Then if the table changes, you can adjust the view to include or exclude columns (attributes), change data types, fill null values, and so on. Or, in another circumstance, if your dataset is based on a slow SQL query, you might create a table to hold the results of the query. 

  If you can't sufficiently adapt the source of the data, recreate the visuals based on your documentation of the analysis.
+ If you no longer have access to a data source, your analyses based on that source are empty. The visuals that you created still exist, but they can't display until they have some data to show. This result can happen if permissions are changed by your administrator.
+ If you remove the dataset that a visual is based on, you might need to recreate the visual from your documentation. You can edit the visual and select a new dataset to use with it. If you need to consistently use a new file to replace an older one, store your data in a location that is consistently available. For example, you might store your .csv file in Amazon S3 and create an S3 dataset to use for your visuals. For more information on accessing files stored in S3, see [Creating a dataset using Amazon S3 files](create-a-data-set-s3.md).

  Or you can import the data into a table, and base your visual on a query. This way, the data structures don't change, even if the data contained in them changes.
+ To centralize data management, consider creating general-purpose datasets that others can use to create their own datasets from. For more information, see [Creating a dataset using an existing dataset in Amazon Quick](create-a-dataset-existing-dataset.md).

## Editing a dataset from the Datasets page


1. From the Quick start page, choose **Data** at left.

1. On the **Data** page that opens, choose the dataset that you want to edit, and then choose **Edit dataset** at upper right.

   The data preparation page opens. For more information about the types of edits you can make to datasets, see [Preparing data in Amazon Quick Sight](preparing-data.md).

## Editing a dataset in an analysis


Use the following procedure to edit a dataset from the analysis page.

**To edit a dataset from the analysis page**

1. In your analysis, choose the pencil icon at the top of the **Fields list** pane.

1. On the **Data sets in this analysis** page that opens, choose the three dots at the right of the dataset that you want to edit, and then choose **Edit**.

   The dataset opens in the data preparation page. For more information about the types of edits you can make to datasets, see [Preparing data in Amazon Quick Sight](preparing-data.md).

# Reverting datasets back to previous published versions
Reverting datasets

When you save and publish changes to a dataset in Amazon Quick Sight, a new version of the dataset is created. At any time, you can see a list of all the previous published versions of that dataset. You can also preview a specific version in that history, or even revert the dataset back to a previous version, if needed.

The following limitations apply to dataset versioning:
+ Only the most recent 1,000 versions of a dataset are shown in the publishing history and are available to revert to.
+ After you exceed 1,000 published versions, the oldest versions are automatically removed from the publishing history, and the dataset can no longer be reverted back to them.

Use the following procedure to revert a dataset to a previous published version.

**To revert a dataset to a previous published version**

1. From the Quick start page, choose **Data**.

1. On the **Data** page, choose a dataset, and then choose **Edit dataset** at upper right.

   For more information about editing datasets, see [Editing datasets](edit-a-data-set.md).

1. On the dataset preparation page that opens, choose the **Manage** icon in the blue toolbar at upper right, and then choose **Publishing history**.

   A list of previous published versions appears at right.

1. In the **Publishing history** pane, find the version that you want and choose **Revert**.

   To preview the version before reverting, choose **Preview**.

   The dataset is reverted and a confirmation message appears. The **Publishing history** pane also updates to show the active version of the dataset.

## Troubleshooting reverting versions
Troubleshooting

Sometimes, the dataset can't be reverted to a specific version for one of the following reasons:
+ The dataset uses one or more data sources that were deleted.

  If this error occurs, you can't revert the dataset to a previous version.
+ Reverting would make a calculated field not valid.

  If this error occurs, you can edit or remove the calculated field, and then save the dataset. Doing this creates a new version of the dataset.
+ One or more columns are missing in the data source.

  If this error occurs, Quick Sight shows the latest schema from the data source in the preview to reconcile differences between versions. Any calculated field, field name, field type, and filter changes shown in the schema preview are from the version that you want to revert to. You can save this reconciled schema as a new version of the dataset. Or you can return to the active (latest) version by choosing **Preview** on the top (latest) version in the publishing history.

# Duplicating datasets


You can duplicate an existing dataset to save a copy of it with a new name. The new dataset is a completely separate copy. 

The **Duplicate dataset** option is available if both of the following are true: you own the dataset, and you have permissions on the data source.

**To duplicate a dataset**

1. From the Quick start page, choose **Data** at left.

1. Choose the dataset that you want to duplicate.

1. On the dataset details page that opens, choose the drop-down for **Edit dataset**, and then choose **Duplicate**.

1. On the Duplicate dataset page that opens, give the duplicated dataset a name, and then choose **Duplicate**.

   The duplicated dataset details page opens. From this page, you can edit the dataset, set up a refresh schedule, and more.

# Sharing datasets


You can give other Quick Sight users and groups access to a dataset by sharing it with them. Then they can create analyses from it. If you make them co-owners, they can also refresh, edit, delete, or reshare the dataset. 

## Sharing a dataset


If you have owner permissions on a dataset, use the following procedure to share it.

**To share a dataset**

1. From the Quick start page, choose **Data** at left.

1. On the **Data** page, choose the dataset that you want to share.

1. On the dataset details page that opens, choose the **Permissions** tab, and then choose **Add users & groups**.

1. Enter the user or group that you want to share this dataset with, and then choose **Add**. You can only invite users who belong to the same Quick account.

   Repeat this step until you have entered information for everyone you want to share the dataset with.

1. For the **Permissions** column, choose a role for each user or group to give them permissions on the dataset.

   Choose **Viewer** to allow the user to create analyses and datasets from the dataset. Choose **Owner** to allow the user to do that and also refresh, edit, delete, and reshare the dataset.

   Users receive emails with a link to the dataset. Groups don't receive invitation emails.
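
You can also share a dataset programmatically with the `UpdateDataSetPermissions` API operation. The following boto3 sketch grants viewer-style access to one user; the account ID, dataset ID, and user ARN are placeholders, and the action list shown is a commonly used read-only set rather than a definitive one (owners need additional update, delete, and reshare actions).

```
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

quicksight.update_data_set_permissions(
    AwsAccountId="111122223333",                 # placeholder account ID
    DataSetId="sales-dataset",                   # hypothetical dataset ID
    GrantPermissions=[
        {
            "Principal": "arn:aws:quicksight:us-east-1:111122223333:user/default/analyst1",
            "Actions": [
                "quicksight:DescribeDataSet",
                "quicksight:DescribeDataSetPermissions",
                "quicksight:PassDataSet",
                "quicksight:DescribeIngestion",
                "quicksight:ListIngestions",
            ],
        }
    ],
)
# To revoke access, call the same operation with RevokePermissions instead.
```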

## Viewing and editing the permissions of users that a dataset is shared with


If you have owner permissions on a dataset, you can use the following procedure to view, edit, or change user access to it. 

**To view, edit, or change user access to a dataset if you have owner permissions for it**

1. From the Quick start page, choose **Data** at left.

1. On the **Data** page, choose the dataset whose permissions you want to view or edit.

1. On the dataset details page that opens, choose the **Permissions** tab.

   A list of all users and groups with access to the dataset is displayed.

1. (Optional) To change permission roles for a user or group, choose the drop-down menu in the **Permissions** column for the user or group. Then choose either **Viewer** or **Owner**.

## Revoking access to a dataset


If you have owner permissions on a dataset, you can use the following procedure to revoke user access to a dataset.

**To revoke user access to a dataset if you have owner permissions for it**

1. From the Quick start page, choose **Data** at left.

1. On the **Data** page, choose the dataset that you want to revoke access to.

1. On the dataset details page that opens, choose the **Permissions** tab.

   A list of all users and groups with access to the dataset is displayed.

1. In the **Actions** column for the user or group, choose **Revoke access**.

# Tracking dashboards and analyses that use a dataset
Tracking dataset assets

When you create a dataset in Quick Sight, you can track which dashboards and analyses use that dataset. This approach is useful when you want to see which resources will be affected when you make changes to a dataset, or want to delete a dataset. 

Use the following procedure to see which dashboards and analyses use a dataset.

**To track resources that use a dataset**

1. From the Quick start page, choose **Data** in the pane at left.

1. On the **Data** page, choose the dataset that you want to track resources for.

1. In the page that opens for that dataset, choose **Edit dataset**.

1. In the data preparation page that opens, choose **Manage** at upper right, and then choose **Usage**.

   The dashboards and analyses that use the dataset are listed in the pane that opens.
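
If you prefer to track usage programmatically, you can walk the account's dashboards and check each one's dataset ARNs. The following boto3 sketch does this for one dataset; the account ID and ARN are placeholders, and analyses can be checked the same way with `list_analyses` and `describe_analysis`.

```
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")
account_id = "111122223333"                      # placeholder account ID
dataset_arn = (
    "arn:aws:quicksight:us-east-1:111122223333:dataset/sales-dataset"
)

# Report every dashboard whose current version references the dataset.
paginator = quicksight.get_paginator("list_dashboards")
for page in paginator.paginate(AwsAccountId=account_id):
    for summary in page["DashboardSummaryList"]:
        dashboard = quicksight.describe_dashboard(
            AwsAccountId=account_id, DashboardId=summary["DashboardId"]
        )["Dashboard"]
        if dataset_arn in dashboard["Version"].get("DataSetArns", []):
            print(summary["Name"])
```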

# Using dataset parameters in Amazon Quick
Dataset parameters

In Amazon Quick, authors can use dataset parameters in direct query to dynamically customize their datasets and apply reusable logic to their datasets. A *dataset parameter* is a parameter created at the dataset level. It's consumed by an analysis parameter through controls, calculated fields, filters, actions, URLs, titles, and descriptions. For more information on analysis parameters, see [Parameters in Amazon Quick](parameters-in-quicksight.md). The following list describes three actions that can be performed with dataset parameters:
+  **Custom SQL in direct query** – Dataset owners can insert dataset parameters into the custom SQL of a direct query dataset. When these parameters are applied to a filter control in a Quick analysis, users can filter their custom data faster and more efficiently.
+ **Repeatable variables** – Static values that appear in multiple locations in the dataset page can be modified in one action using custom dataset parameters.
+ **Move calculated fields to datasets** – Quick authors can copy calculated fields with parameters in an analysis and migrate them to the dataset level. This protects calculated fields at the analysis level from being accidentally modified, and it lets calculated fields be shared across multiple analyses.

In some situations, dataset parameters improve filter control performance for direct query datasets that require complex custom SQL and simplify business logic at the dataset level.

**Topics**
+ [Dataset parameter limitations](#dataset-parameters-limitations)
+ [Creating dataset parameters in Amazon Quick](dataset-parameters-SQL.md)
+ [Inserting dataset parameters into custom SQL](dataset-parameters-insert-parameter.md)
+ [Adding dataset parameters to calculated fields](dataset-parameters-calculated-fields.md)
+ [Adding dataset parameters to filters](dataset-parameters-dataset-filters.md)
+ [Using dataset parameters in Quick analyses](dataset-parameters-analysis.md)
+ [Advanced use cases of dataset parameters](dataset-parameters-advanced-options.md)

## Dataset parameter limitations


This section covers known limitations that you might encounter when working with dataset parameters in Amazon Quick.
+ When dashboard readers schedule emailed reports, selected controls don't propagate to the dataset parameters that are included in the report that's attached to the email. Instead, the default values of the parameters are used.
+ Dataset parameters can't be inserted into custom SQL of datasets stored in SPICE.
+ Dynamic defaults can only be configured on the analysis page of the analysis that is using the dataset. You can't configure a dynamic default at the dataset level.
+ The **Select all** option is not supported on multivalue controls of analysis parameters that are mapped to dataset parameters.
+ Cascading controls are not supported for dataset parameters.
+ Dataset parameters can only be used by dataset filters when the dataset is using direct query.
+ In a custom SQL query, only 128 dataset parameters can be used.

# Creating dataset parameters in Amazon Quick
Creating dataset parameters

Use the following procedures to get started using dataset parameters.

**To create a new dataset parameter**

1. From the Quick start page, choose **Data** on the left, choose the ellipsis (three dots) next to the dataset that you want to change, and then choose **Edit**.

1. On the **Dataset** page that opens, choose **Parameters** on the left, and then choose the ($1) icon to create a new dataset parameter.

1. In the **Create new parameter** pop-up that appears, enter a parameter name in the **Name** box.

1. In the **Data type** dropdown, choose the parameter data type that you want. Supported data types are `String`, `Integer`, `Number`, and `Datetime`. This option can't be changed after the parameter is created.

1. For **Default value**, enter the default value that you want the parameter to have.
**Note**  
When you map a dataset parameter to an analysis parameter, a different default value can be chosen. When this happens, the default value configured here is overridden by the new default value.

1. For **Values**, choose the value type that you want the parameter to have. **Single value** parameters support single-select dropdown, text field, and list controls. **Multiple values** parameters support multi-select dropdown controls. This option can't be changed after the parameter is created.

1. When you are finished configuring the new parameter, choose **Create** to create the parameter.
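
Dataset parameters can also be defined through the API, by way of the `DatasetParameters` field on `CreateDataSet` and `UpdateDataSet`. The following boto3 sketch adds a single-valued string parameter named `RegionName` with a static default, mirroring the console steps above; the account ID, dataset ID, and parameter ID are placeholders.

```
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

dataset = quicksight.describe_data_set(
    AwsAccountId="111122223333", DataSetId="sales-dataset"  # placeholders
)["DataSet"]

quicksight.update_data_set(
    AwsAccountId="111122223333",
    DataSetId="sales-dataset",
    Name=dataset["Name"],
    PhysicalTableMap=dataset["PhysicalTableMap"],
    ImportMode=dataset["ImportMode"],
    DatasetParameters=[
        {
            "StringDatasetParameter": {
                "Id": "region-param-1",          # hypothetical parameter ID
                "Name": "RegionName",
                "ValueType": "SINGLE_VALUED",
                "DefaultValues": {"StaticValues": ["EU"]},
            }
        }
    ],
)
```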

# Inserting dataset parameters into custom SQL


You can insert dataset parameters into the custom SQL of a dataset in direct query mode by referencing them with `<<$parameter_name>>` in the SQL statement. At runtime, dashboard users can enter filter control values that are associated with a dataset parameter. Then, they can see the results in the dashboard visuals after the values propagate to the SQL query. You can use parameters to create basic filters based on user input in `where` clauses. Alternatively, you can add `case when` or `if else` clauses to dynamically change the logic of the SQL query based on a parameter's input.

For example, say you want to add a `WHERE` clause to your custom SQL that filters data based on an end user's Region name. In this case, you create a single value parameter called `RegionName`:

```
SELECT *
FROM transactions
WHERE region = <<$RegionName>>
```

You can also let users provide multiple values to the parameter:

```
SELECT *
FROM transactions
WHERE region in (<<$RegionNames>>)
```

In the following more complex example, a dataset author references each of two dataset parameters twice, based on the first and last names that a user can select in a dashboard filter control:

```
SELECT Region, Country, OrderDate, Sales
FROM transactions
WHERE region=
(Case
WHEN <<$UserFIRSTNAME>> In 
    (select firstname from user where region='region1') 
    and <<$UserLASTNAME>> In 
    (select lastname from user where region='region1') 
    THEN 'region1'
WHEN <<$UserFIRSTNAME>> In 
    (select firstname from user where region='region2') 
    and <<$UserLASTNAME>> In 
    (select lastname from user where region='region2') 
    THEN 'region2'
ELSE 'region3'
END)
```

You can also use parameters in `SELECT` clauses to create new columns in a dataset from user input:

```
SELECT Region, Country, date, 
    (case 
    WHEN <<$RegionName>>='EU'
    THEN sum(sales) * 0.93   --convert US dollar to euro
    WHEN <<$RegionName>>='CAN'
    THEN sum(sales) * 0.78   --convert US dollar to Canadian Dollar
    ELSE sum(sales) -- US dollar
    END
    ) as "Sales"
FROM transactions
WHERE region = <<$RegionName>>
GROUP BY Region, Country, date
```

To create a custom SQL query or to edit an existing query before adding a dataset parameter, see [Using SQL to customize data](adding-a-SQL-query.md).

When you apply custom SQL with a dataset parameter, `<<$parameter_name>>` is used as a placeholder value. When a user chooses one of the parameter values from a control, Quick replaces the placeholder with the values that the user selects on the dashboard.

In the following example, the user enters a new custom SQL query that filters data by state:

```
select * from all_flights
where origin_state_abr = <<$State>>
```

The default value of the parameter is applied to the SQL query and the results appear in the **Preview pane**.

# Adding dataset parameters to calculated fields


You can also add dataset parameters to calculated field expressions using the format `${parameter_name}`.

When you create a calculation, you can choose from the existing parameters in the **Parameters** list. You can't create a calculated field that contains a multivalued parameter.

For more information on adding calculated fields, see [Using calculated fields with parameters in Amazon Quick](parameters-calculated-fields.md).

# Adding dataset parameters to filters


For datasets in direct query mode, dataset authors can use dataset parameters in filters without custom SQL. Dataset parameters can't be added to filters if the dataset is in SPICE.

**To add a dataset parameter to a filter**

1. Open the dataset page of the dataset that you want to create a filter for. Choose **Filters** on the left, and then choose **Add filter**.

1. Enter the name that you want the filter to have, and in the dropdown, choose the field that you want to filter.

1. After you create the new filter, navigate to the filter in the **Filters** pane, choose the ellipsis (three dots) next to the filter, and then choose **Edit**.

1. For **Filter type**, choose **Custom filter**.

1. For **Filter condition**, choose the condition that you want.

1. Select the **Use parameter** box and choose the dataset parameter that you want the filter to use.

1. When you are finished making changes, choose **Apply**.

# Using dataset parameters in Quick analyses


After you create a dataset parameter and add the dataset to an analysis, map the dataset parameter to a new or existing analysis parameter. After you map a dataset parameter to an analysis parameter, you can use it with filters, controls, and any other analysis parameter feature.

You can manage your dataset parameters in the **Parameters** pane of the analysis that is using the dataset that the parameters belong to. In the **Dataset Parameters** section of the **Parameters** pane, you can choose to see only the unmapped dataset parameters (default). Alternatively, you can choose to see all mapped and unmapped dataset parameters by choosing **ALL** from the **Viewing** dropdown.

## Mapping dataset parameters in new Quick analyses


When you create a new analysis from a dataset that contains parameters, you need to map the dataset parameters to the analysis before you can use them. This is also true when you add a dataset with parameters to an analysis. You can view all unmapped parameters in an analysis in the **Parameters** pane of the analysis. Alternatively, choose **VIEW** in the notification message that appears in the top right of the page when you create the analysis or add the dataset.

**To map a dataset parameter to an analysis parameter**

1. Open the [Quick console](https://quicksight.aws.amazon.com/).

1. Choose the analysis that you want to change.

1. Choose the **Parameters** icon to open the **Parameters** pane.

1. Choose the ellipsis (three dots) next to the dataset parameter that you want to map, choose **Map Parameter**, and then choose the analysis parameter that you want to map your dataset parameter to.

   If your analysis doesn't have any analysis parameters, you can choose **Map parameter** and **Create new** to create an analysis parameter that is automatically mapped to the dataset parameter upon creation.

   1. (Optional) In the **Create new parameter** pop-up that appears, for **Name**, enter a name for the new analysis parameter.

   1. (Optional) For **Static default value**, choose the static default value that you want the parameter to have.

   1. (Optional) Choose **Set a dynamic default** to set a dynamic default for the new parameter.

   1. In the **Mapped dataset parameters** table, you will see the dataset parameter that you are mapping to the new analysis parameter. You can add other dataset parameters to this analysis parameter by choosing the **ADD DATASET PARAMETER** dropdown and then choosing the parameter that you want to map. You can unmap a dataset parameter by choosing the **Remove** button next to the dataset parameter that you want to remove.

   For more information on creating analysis parameters, see [Setting up parameters in Amazon Quick](parameters-set-up.md).

When you map a dataset parameter to an analysis parameter, the analysis parameter represents the dataset parameter wherever it is used in the analysis.

You can also map and unmap dataset parameters to analysis parameters in the **Edit parameter** window. To open the **Edit parameter** window, navigate to the **Parameters** pane, choose the ellipsis (three dots) next to the analysis parameter that you want to change, and then choose **Edit parameter**. You can add other dataset parameters to this analysis parameter by choosing the **ADD DATASET PARAMETER** dropdown and then choosing the parameter that you want to map. You can unmap a dataset parameter by choosing the **Remove** button next to the dataset parameter that you want to remove. You can also remove all mapped dataset parameters by choosing **REMOVE ALL**. When you are done making changes, choose **Update**.

When you delete an analysis parameter, all dataset parameters are unmapped from the analysis and appear in the **UNMAPPED** section of the **Parameters** pane. You can only map a dataset parameter to one analysis parameter at a time. To map a dataset parameter to a different analysis parameter, unmap the dataset parameter and then map it to the new analysis parameter.

## Adding filter controls to mapped analysis parameters


After you map a dataset parameter to an analysis parameter in Quick, you can create controls for the parameter and use it in filters, actions, calculated fields, titles, descriptions, and URLs.

**To add a control to a mapped parameter**

1. In the **Parameters** pane of the analysis page, choose the ellipsis (three dots) next to the mapped analysis parameter that you want, and then choose **Add control**.

1. In the **Add control** window that appears, enter the **Name** that you want and choose the **Style** that you want the control to have. For single value controls, choose between `Dropdown`, `List`, and `Text field`. For multivalue controls, choose `Dropdown`.

1. Choose **Add** to create the control.

# Advanced use cases of dataset parameters
Advanced use

This section covers more advanced options and use cases for working with dataset parameters and dropdown controls. Use the following walkthroughs to create dynamic dropdown values with dataset parameters.

## Using multivalue controls with dataset parameters


When you use dataset parameters that are inserted into the custom SQL of a dataset, the dataset parameters commonly filter data by values from a specific column. If you create a dropdown control and assign the parameter as the value, the dropdown shows only the values that pass the parameter's filter. The following procedure shows how you can create a control that is mapped to a dataset parameter and shows all unfiltered values.

**To populate all assigned values in a dropdown control**

1. Create a new single-column dataset in SPICE or direct query that includes all unique values from the original dataset. For example, let's say that your original dataset is using the following custom SQL:

   ```
   select * from all_flights
           where origin_state_abr = <<$State>>
   ```

   To create a single-column table with all unique origin states, apply the following custom SQL to the new dataset:

   ```
   SELECT distinct origin_state_abr FROM all_flights
           order by origin_state_abr asc
   ```

   The SQL expression returns all unique states in alphabetical order. The new dataset doesn't have any dataset parameters.

1. Enter a **Name** for the new dataset, and then save and publish the dataset. In our example, the new dataset is called `State Codes`.

1. Open the analysis that contains the original dataset, and add the new dataset to the analysis. For information on adding datasets to an existing analysis, see [Adding a dataset to an analysis](adding-a-data-set-to-an-analysis.md).

1. Navigate to the **Controls** pane and find the dropdown control that you want to edit. Choose the ellipsis (three dots) next to the control, and then choose **Edit**.

1. In the **Format control** pane that appears on the left, choose **Link to a dataset field** in the **Values** section.

1. For the **Dataset** dropdown that appears, choose the new dataset that you created. In our example, the `State Codes` dataset is chosen.

1. For the **Field** dropdown that appears, choose the appropriate field. In our example, the `origin_state_abr` field is chosen.

After you finish linking the control to the new dataset, all unique values appear in the control's dropdown. These include the values that are filtered out by the dataset parameter.

## Using controls with Select all options


By default, when one or more dataset parameters are mapped to an analysis parameter and added to a control, the `Select all` option is not available. The following procedure shows a workaround that uses the same example scenario from the previous section.

**Note**  
This walkthrough is for datasets that are small enough to load in direct query. If you have a large dataset and want to use the `Select All` option, it is recommended that you load the dataset into SPICE. However, if you want to use the `Select All` option with dataset parameters, this walkthrough describes a way to do so.

To begin, let's say you have a direct query dataset with custom SQL that contains a multivalue parameter called `States`:

```
select * from all_flights
where origin_state_abr in (<<$States>>)
```

**To use the Select all option in a control that uses dataset parameters**

1. In the **Parameters** pane of the analysis, find the dataset parameter that you want to use and choose **Edit** from the ellipsis (three dots) next to the parameter.

1. In the **Edit parameter** window that appears, enter a new default value in the **Static multiple default values** section. In our example, the default value is ` All States`. Note that the example uses a leading space character so that the default value appears as the first item in the control.

1. Choose **Update** to update the parameter.

1. Navigate to the dataset that contains the dataset parameter that you're using in the analysis. Edit the custom SQL of the dataset to include a default use case for your new static multiple default values. Using the ` All States` example, the SQL expression appears as follows:

   ```
   select * from public.all_flights
   where
       ' All States' in (<<$States>>) or
       origin_state_abr in (<<$States>>)
   ```

   If the user chooses ` All States` in the control, the new SQL expression returns all unique records. If the user chooses a different value from the control, the query returns values that were filtered by the dataset parameter.

### Using controls with Select all and multivalue options


You can combine the previous `Select all` procedure with the multivalue control method discussed earlier to create dropdown controls that contain a `Select all` value in addition to multiple values that the user can select. This walkthrough assumes that you have followed the previous procedures, that you know how to map dataset parameters to analysis parameters, and that you can create controls in an analysis. For more information on mapping analysis parameters, see [Mapping dataset parameters in new Quick analyses](dataset-parameters-analysis.md#dataset-parameters-map-to-analysis). For more information on creating controls in an analysis that is using dataset parameters, see [Adding filter controls to mapped analysis parameters](dataset-parameters-analysis.md#dataset-parameters-analysis-filter-control).

**To add multiple values to a control with a Select all option and a mapped dataset parameter**

1. Open the analysis that has the original dataset with a `Select all` custom SQL expression and a second dataset that includes all possible values of the filtered column that exists in the original dataset.

1. Navigate to the secondary dataset that was created earlier to return all values of a filtered column. Add a custom SQL expression that adds your previously configured `Select all` option to the query. The following example adds the ` All States` record to the top of the list of returned values of the dataset:

   ```
   (Select ' All States' as origin_state_abr)
       Union All
       (SELECT distinct origin_state_abr FROM all_flights
       order by origin_state_abr asc)
   ```

1. Go back to the analysis that the datasets belong to and map the dataset parameter that you are using to the analysis parameter that you created in step 3 of the previous procedure. The analysis parameter and dataset parameter can have the same name. In our example, the analysis parameter is called `States`.

1. Create a new filter control or edit an existing filter control and choose **Hide Select All** to hide the disabled **Select All** option that appears in multivalue controls.

Once you create the control, users can use the same control to select all or multiple values of a filtered column in a dataset.

# Using row-level security in Amazon Quick
Using row-level security


|  | 
| --- |
|  Applies to:  Enterprise Edition  | 

In the Enterprise edition of Amazon Quick, you can restrict access to a dataset by configuring row-level security (RLS) on it. You can do this before or after you have shared the dataset. When you share a dataset that uses RLS with dataset owners, they can still see all the data. When you share it with readers, however, they can see only the data permitted by the permissions dataset rules.

Also, when you embed Amazon Quick dashboards in your application for unregistered users of Quick, you can use row-level security (RLS) to filter or restrict data with tags. A tag is a user-specified string that identifies a session in your application. You can use tags to implement RLS controls for your datasets. When you configure RLS-based restrictions in datasets, Quick filters the data based on the session tags tied to the user's identity or session.

You can restrict access to a dataset using username or group-based rules, tag-based rules, or both.

Choose user-based rules if you want to secure data for users or groups provisioned (registered) in Quick. To do so, select a permissions dataset that contains column-based rules for each user or group that accesses the data. Only users or groups identified in the rules have access to the data.

Choose tag-based rules only if you are using embedded dashboards and want to secure data for users who aren't provisioned (unregistered users) in Quick. To do so, define tags on columns to secure the data. Tag values must be passed when you embed the dashboards.

**Topics**
+ [Using row-level security with user-based rules to restrict access to a dataset](restrict-access-to-a-data-set-using-row-level-security.md)
+ [Using row-level security with tag-based rules to restrict access to a dataset when embedding dashboards for anonymous users](quicksight-dev-rls-tags.md)

# Using row-level security with user-based rules to restrict access to a dataset
Using user-based rules


|  | 
| --- |
|  Applies to:  Enterprise Edition  | 

In the Enterprise edition of Amazon Quick, you can restrict access to a dataset by configuring row-level security (RLS) on it. You can do this before or after you have shared the dataset. When you share a dataset that uses RLS with dataset owners, they can still see all the data. When you share it with readers, however, they can see only the data permitted by the permissions dataset rules. By adding row-level security, you can further control their access.

**Note**  
When you apply row-level security to SPICE datasets, each field in the dataset can contain up to 2,047 Unicode characters. Fields that exceed this quota are truncated during ingestion. To learn more about SPICE data quotas, see [SPICE quotas for imported data](data-source-limits.md#spice-limits).

To do this, you create a query or file with one column for user or group identification. You can use either `UserName` and `GroupName`, or alternatively `UserARN` and `GroupARN`. You can think of this as *adding a rule* for that user or group. Then you can add one column to the query or file for each field that you want to grant or restrict access to. For each user or group name that you add, you add the values for each field. You can use NULL (no value) to mean all values. To see examples of dataset rules, see [Creating dataset rules for row-level security](#create-data-set-rules-for-row-level-security).

To apply the dataset rules, you add the rules as a permissions dataset to your dataset. Keep in mind the following points:
+ The permissions dataset can't contain duplicate values. Duplicates are ignored when evaluating how to apply the rules.
+ Each user or group specified can see only the rows that *match* the field values in the dataset rules. 
+ If you add a rule for a user or group and leave all other columns with no value (NULL), you grant them access to all the data. 
+ If you don't add a rule for a user or group, that user or group can't see any of the data. 
+ The full set of rule records that are applied per user must not exceed 999. This limitation applies to the total number of rules that are directly assigned to a username, plus any rules that are assigned to the user through group names. 
+ If a field includes a comma (,), Amazon Quick treats each word separated by a comma as an individual value in the filter. For example, `AWS,INC` is considered as two strings: `AWS` and `INC`. To filter with the literal value `AWS,INC`, wrap the string with double quotation marks in the permissions dataset (see the example after this list). 
+ If the restricted dataset is a SPICE dataset, the number of filter values applied per user can't exceed 192,000 for each restricted field. This applies to the total number of filter values that are directly assigned to a username, plus any filter values that are assigned to the user through group names.
+ If the restricted dataset is a direct query dataset, the number of filter values applied per user varies by data source.
+ Exceeding the filter value limit can cause visual rendering to fail. We recommend adding a column to your restricted dataset that divides the rows into groups based on the original restricted column, so that the filter list can be shortened.
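
For example, the following hypothetical permissions file grants the `US-Sales` group access only to rows whose `Company` field contains the literal string `AWS,INC`:

```
GroupName,Company
US-Sales,"AWS,INC"
```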

Amazon Quick treats spaces as literal values. If you have a space in a field that you are restricting, the dataset rule applies to those rows. Amazon Quick treats both NULLs and blanks (empty strings "") as "no value". A NULL is an empty field value. 

Depending on what data source your dataset is coming from, you can configure a direct query to access a table of permissions. Terms with spaces inside them don't need to be delimited with quotes. If you use a direct query, you can easily change the query in the original data source. 

Or you can upload dataset rules from a text file or spreadsheet. If you are using a comma-separated value (CSV) file, don't include any spaces on the given line. Terms with spaces inside them need to be delimited with quotation marks. If you use dataset rules that are file-based, apply any changes by overwriting the existing rules in the dataset's permissions settings.

Datasets that are restricted are marked with the word **RESTRICTED** in the **Data** screen.

Child datasets that are created from a parent dataset that has RLS rules active retain the same RLS rules that the parent dataset has. You can add more RLS rules to the child dataset, but you can't remove the RLS rules that the dataset inherits from the parent dataset. 

Child datasets that are created from a parent dataset that has RLS rules active can only be created with Direct Query. Child datasets that inherit the parent dataset's RLS rules aren't supported in SPICE.

Row-level security works only for fields containing textual data (string, char, varchar, and so on). It doesn't currently work for dates or numeric fields. Anomaly detection is not supported for datasets that use row-level security (RLS).

## Creating dataset rules for row-level security


Use the following procedure to create a permissions file or query to use as dataset rules.

**To create a permissions file or query to use as dataset rules**

1. Create a file or a query that contains the dataset rules (permissions) for row-level security. 

   It doesn't matter what order the fields are in. However, all the fields are case-sensitive. Make sure that they exactly match the field names and values. 

   The structure should look similar to one of the following. Make sure that you have at least one field that identifies either users or groups. You can include both, but only one is required, and only one is used at a time. The field that you use for users or groups can have any name you choose.
**Note**  
If you are specifying groups, use only Amazon Quick groups or Microsoft AD groups. 

   The following example shows a table with groups.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/quick/latest/userguide/restrict-access-to-a-data-set-using-row-level-security.html)

   The following example shows a table with usernames.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/quick/latest/userguide/restrict-access-to-a-data-set-using-row-level-security.html)

   The following example shows a table with user and group Amazon Resource Names (ARNs).    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/quick/latest/userguide/restrict-access-to-a-data-set-using-row-level-security.html)

   Or if you use a .csv file, the structure should look similar to one of the following.

   ```
   UserName,SalesRegion,Segment
   AlejandroRosalez,EMEA,"Enterprise,SMB,Startup"
   MarthaRivera,US,Enterprise
   NikhilJayashankars,US,SMB
   PauloSantos,US,Startup
   SaanviSarkar,APAC,"SMB,Startup"
   sales-tps@example.com,"",""
   ZhangWei,APAC-Sales,"Enterprise,Startup"
   ```

   ```
   GroupName,SalesRegion,Segment
   EMEA-Sales,EMEA,"Enterprise,SMB,Startup"
   US-Sales,US,Enterprise
   US-Sales,US,SMB
   US-Sales,US,Startup
   APAC-Sales,APAC,"SMB,Startup"
   Corporate-Reporting,"",""
   APAC-Sales,APAC,"Enterprise,Startup"
   ```

   ```
   UserARN,GroupARN,SalesRegion
   arn:aws:quicksight:us-east-1:123456789012:user/Bob,arn:aws:quicksight:us-east-1:123456789012:group/group-1,APAC
   arn:aws:quicksight:us-east-1:123456789012:user/Sam,arn:aws:quicksight:us-east-1:123456789012:group/group-2,US
   ```

   Following is a SQL example.

   ```
   /* for users */
   select User as UserName, SalesRegion, Segment
   from tps-permissions;

   /* for groups */
   select Group as GroupName, SalesRegion, Segment
   from tps-permissions;
   ```

1. Create a dataset for the dataset rules. To make sure that you can easily find it, give it a meaningful name, for example **Permissions-Sales-Pipeline**.

## Rules dataset flagging for row-level security


Use the following procedure to appropriately flag a dataset as a rules dataset.

Rules dataset is a flag that distinguishes permissions datasets used for row-level security from regular datasets. If a permissions dataset was applied to a regular dataset before March 31, 2025, it has a rules dataset flag on the **Dataset** landing page. 

If a permissions dataset was not applied to a regular dataset by March 31, 2025, it is categorized as a regular dataset. To use it as a rules dataset, duplicate the permissions dataset and flag the duplicate as a rules dataset in the console when you create it. Select EDIT DATASET and, under the options, choose DUPLICATE AS RULES DATASET. 

To successfully duplicate it as a rules dataset, make sure that the original dataset has the required user metadata or group metadata columns and contains only string type columns.

To create a new rules dataset on the console, select NEW RULES DATASET under the NEW DATASET dropdown. When creating a rules dataset programmatically, add the following parameter: [UseAs: RLS_RULES](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_CreateDataSet.html#API_CreateDataSet_RequestSyntax). This optional parameter is used only to create a rules dataset. After a dataset has been created, either through the console or programmatically, and flagged as either a rules dataset or a regular dataset, the flag cannot be changed.
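
For example, the following AWS CLI sketch creates a rules dataset. The `<value>` placeholders follow the skeleton convention used elsewhere in this guide, and `--use-as` is the CLI form of the `UseAs` request parameter:

```
create-data-set
    --aws-account-id <value>
    --data-set-id <value>
    --name <value>
    --physical-table-map <value>
    --import-mode <value>
    --use-as RLS_RULES
```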

Once datasets are flagged as rules datasets, Amazon Quick applies strict SPICE ingestion rules to them. To ensure data integrity, SPICE ingestions for rules datasets fail if there are invalid rows or cells that exceed length limits. You must fix the ingestion issues to re-initiate a successful ingestion. Strict ingestion rules apply only to rules datasets. Regular datasets don't have dataset ingestion failures when there are skipped rows or string truncations. 

## Applying row-level security


Use the following procedure to apply row-level security (RLS) by using a file or query as a dataset that contains the rules for permissions. 

**To apply row-level security by using a file or query**

1. Confirm that you have added your rules as a new dataset. If you added them, but don't see them under the list of datasets, refresh the screen.

1. On the **Data** page, choose the dataset.

1. On the dataset details page that opens, for **Row-level security**, choose **Set up**.

1. On the **Set up row-level security** page that opens, choose **User-based rules**.

1. From the list of datasets that appears, choose your permissions dataset. 

   If your permissions dataset doesn't appear on this screen, return to your datasets, and refresh the page.

1. For **Permissions policy**, choose **Grant access to dataset**. Each dataset has only one active permissions dataset. If you try to add a second permissions dataset, it overwrites the existing one.
**Important**  
Some restrictions apply to NULL and empty string values when working with row-level security:  
If your dataset has NULL values or empty strings ("") in the restricted fields, these rows are ignored when the restrictions are applied. 
Inside the permissions dataset, NULL values and empty strings are treated the same. For more information, see the following table.
To prevent accidentally exposing sensitive information, Amazon Quick skips empty RLS rules that grant access to everyone. An *empty RLS rule* occurs when all columns of a row have no value. Quick RLS treats NULL, empty strings (""), or empty comma separated strings (for example ",,,") as no value.  
After skipping empty rules, other nonempty RLS rules still apply.
If a permission dataset has only empty rules and all of them were skipped, no one will have access to any data restricted by this permission dataset.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/quick/latest/userguide/restrict-access-to-a-data-set-using-row-level-security.html)

   Anyone whom you share your dashboard with can see all the data in it, unless the dataset is restricted by dataset rules. 

1. Choose **Apply dataset** to save your changes. Then, on the **Save data set rules?** page, choose **Apply and activate**. Changes in permissions apply immediately to existing users. 

1. (Optional) To remove permissions, first remove the dataset rules from the dataset. 

   Make certain that the dataset rules are removed. Then, choose the permissions dataset and choose **Remove data set**.

   To overwrite permissions, choose a new permissions dataset and apply it. You can reuse the same dataset name. However, make sure to apply the new permissions in the **Permissions** screen to make these permissions active. SQL queries dynamically update, so these can be managed outside of Amazon Quick. For queries, the permissions are updated when the direct query cache is automatically refreshed.

If you delete a file-based permissions dataset before you remove it from the target dataset, restricted users can't access the dataset. While the dataset is in this state, it remains marked as **RESTRICTED**. However, when you view **Permissions** for that dataset, you can see that it has no selected dataset rules. 

To fix this, specify new dataset rules. Creating a dataset with the same name is not enough to fix this. You must choose the new permissions dataset on the **Permissions** screen. This restriction doesn't apply to direct SQL queries.

# Using row-level security with tag-based rules to restrict access to a dataset when embedding dashboards for anonymous users
Using tag-based rules


|  | 
| --- |
|  Applies to:  Enterprise Edition  | 


|  | 
| --- |
|    Intended audience:  Amazon Quick Administrators and Amazon Quick developers  | 

When you embed Amazon Quick dashboards in your application for users who are not provisioned (registered) in Quick, you can use row-level security (RLS) to filter or restrict data with tags. A tag is a user-specified string that identifies a session in your application. You can use tags to implement RLS controls for your datasets. When you configure RLS-based restrictions in datasets, Quick filters the data based on the session tags tied to the user's identity or session.

For example, let's say you're a logistics company that has a customer-facing application for various retailers. Thousands of users from these retailers access your application to see metrics related to how their orders are getting shipped from your warehouse. 

You don't want to manage thousands of users in Quick, so you use anonymous embedding to embed the selected dashboards in your application that your authenticated and authorized users can see. However, you want to make sure retailers see only data that is for their business and not for others. You can use RLS with tags to make sure your customers only see data that's relevant to them.

To do so, complete the following steps:

1. Add RLS tags to a dataset.

1. Assign values to those tags at runtime using the `GenerateEmbedUrlForAnonymousUser` API operation.

   For more information about embedding dashboards for anonymous users using the `GenerateEmbedUrlForAnonymousUser` API operation, see [Embedding Amazon Quick Sight dashboards for anonymous (unregistered) users](embedded-analytics-dashboards-for-everyone.md).

Before you can use RLS with tags, keep in mind the following points:
+ Using RLS with tags is currently only supported for anonymous embedding, specifically for embedded dashboards that use the `GenerateEmbedUrlForAnonymousUser` API operation.
+ Using RLS with tags isn't supported for embedded dashboards that use the `GenerateEmbedURLForRegisteredUser` API operation or the old `GetDashboardEmbedUrl` API operation.
+ RLS tags aren't supported with AWS Identity and Access Management (IAM) or the Quick identity type.
+ When you apply row-level security to SPICE datasets, each field in the dataset can contain up to 2,047 Unicode characters. Fields that exceed this quota are truncated during ingestion. To learn more about SPICE data quotas, see [SPICE quotas for imported data](data-source-limits.md#spice-limits).

## Step 1: Add RLS tags to a dataset


You can add tag-based rules to a dataset in Amazon Quick. Alternatively, you can call the `CreateDataSet` or `UpdateDataSet` API operation and add tag-based rules that way. For more information, see [Add RLS tags to a dataset using the API](#quicksight-dev-rls-tags-add-api).

Use the following procedure to add RLS tags to a dataset in Quick.

**To add RLS tags to a dataset**

1. From the Quick start page, choose **Data** at left.

1. Choose the dataset that you want to add RLS to.

1. On the dataset details page that opens, for **Row-level security**, choose **Set up**.

1. On the **Set up row-level security** page that opens, choose **Tag-based rules**.

1. For **Column**, choose a column that you want to add tag rules to.

   For example, in the case for the logistics company, the `retailer_id` column is used.

   Only columns with a string data type are listed.

1. For **Tag**, enter a tag key. You can enter any tag name that you want.

   For example, in the case for the logistics company, the tag key `tag_retailer_id` is used. Doing this sets row-level security based on the retailer that's accessing the application.

1. (Optional) For **Delimiter**, choose a delimiter from the list, or enter your own.

   You can use delimiters to separate text strings when assigning more than one value to a tag. The value for a delimiter can be 10 characters long, at most.

1. (Optional) For **Match all**, choose the asterisk (\*), or enter your own character or characters.

   This option can be any character or characters that you want to use to filter by all the values in that column in the dataset. Instead of listing the values one by one, you can use this character. If specified, this value must be between 1 and 256 characters long.

1. Choose **Add**.

   The tag rule is added to the dataset and is listed at the bottom, but it isn't applied yet. To add another tag rule to the dataset, repeat steps 5–9. To edit a tag rule, choose the pencil icon that follows the rule. To delete a tag rule, choose the delete icon that follows the rule. You can add up to 50 tags to a dataset.

1. When you're ready to apply the tag rules to the dataset, choose **Apply rules**.

1. On the **Turn on tag-based security?** page that opens, choose **Apply and activate**.

   The tag-based rules are now active. On the **Set up row-level security** page, a toggle appears for you to turn tag rules on and off for the dataset.

   To turn off all tag-based rules for the dataset, switch the **Tag-Based rules** toggle off, and then enter "confirm" in the text box that appears.

   On the **Data** page, a lock icon appears in the dataset row to indicate that tag rules are enabled.

   You can now use tag rules to set tag values at runtime, described in [Step 2: Assign values to RLS tags at runtime](#quicksight-dev-rls-tags-assign-values). The rules only affect Quick readers when active.
**Important**  
After tags are assigned and enabled on the dataset, make sure to give Quick authors permissions to see any of the data in the dataset when authoring a dashboard.   
To give Quick authors permission to see data in the dataset, create a permissions file or query to use as dataset rules. For more information, see [Creating dataset rules for row-level security](restrict-access-to-a-data-set-using-row-level-security.md#create-data-set-rules-for-row-level-security).

After you create a tag-based rule, a new **Manage rules** table appears that shows how your tag-based rules relate to each other. To make changes to the rules listed in the **Manage rules** table, choose the pencil icon that follows the rule. Then add or remove tags, and choose **Update**. To apply your updated rule to the dataset, choose **Apply**.

### (Optional) Add the OR condition to RLS tags


You can also add the OR condition to your tag-based rules to further customize the way data is presented to your Quick account users. When you use the OR condition with your tag-based rules, visuals in Quick appear if at least one tag defined in the rule is valid.

**To add the OR condition to your tag-based rules**

1. In the **Manage rules** table, choose **Add OR condition**.

1. In the **Select tag** dropdown list that appears, choose the tag that you want to create an OR condition for. You can add up to 50 OR conditions to the **Manage rules** table. You can add multiple tags to a single column in a dataset, but at least one column tag needs to be included in a rule.

1. Choose **Update** to add the condition to your rule, then choose **Apply** to apply the updated rule to your dataset.

### Add RLS tags to a dataset using the API


Alternatively, you can configure and enable tag-based row-level security on your dataset by calling the `CreateDataSet` or `UpdateDataSet` API operation. Use the following examples to learn how.

**Important**  
When you configure session tags in the API call, follow these guidelines:  
Treat session tags as security credentials. Do not expose session tags to end users or client-side code.
Implement server-side controls. Ensure that session tags are set exclusively by your trusted backend services, not by parameters that end users can modify.
Protect session tags from enumeration. Ensure that users in one tenant cannot discover or guess sessionTag values belonging to other tenants.
Review your architecture. If downstream customers or partners are allowed to call the API directly, evaluate whether those parties could specify sessionTag values for tenants they should not access.

------
#### [ CreateDataSet ]

The following is an example for creating a dataset that uses RLS with tags. It assumes the scenario of the logistics company described previously. The tags are defined in the `row-level-permission-tag-configuration` element. The tags are defined on the columns that you want to secure the data for. For more information about this optional element, see [RowLevelPermissionTagConfiguration](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_RowLevelPermissionTagConfiguration.html) in the *Amazon Quick API Reference*.

```
create-data-set
		--aws-account-id <value>
		--data-set-id <value>
		--name <value>
		--physical-table-map <value>
		[--logical-table-map <value>]
		--import-mode <value>
		[--column-groups <value>]
		[--field-folders <value>]
		[--permissions <value>]
		[--row-level-permission-data-set <value>]
		[--column-level-permission-rules <value>]
		[--tags <value>]
		[--cli-input-json <value>]
		[--generate-cli-skeleton <value>]
		[--row-level-permission-tag-configuration 
	'{
		"Status": "ENABLED",
		"TagRules": 
			[
				{
					"TagKey": "tag_retailer_id",
					"ColumnName": "retailer_id",
					"TagMultiValueDelimiter": ",",
					"MatchAllValue": "*"
				},
				{
					"TagKey": "tag_role",
					"ColumnName": "role"
				}
			],
		"TagRuleConfigurations":
			[
				tag_retailer_id
			],
			[
				tag_role
			]
	}'
]
```

The tags in this example are defined in the `TagRules` part of the element. In this example, two tags are defined based on two columns:
+ The `tag_retailer_id` tag key is defined for the `retailer_id` column. In this case for the logistics company, this sets row-level security based on the retailer that's accessing the application.
+ The `tag_role` tag key is defined for the `role` column. In this case for the logistics company, this sets an additional layer of row-level security based on the role of the user accessing your application from a specific retailer. An example is `store_supervisor` or `manager`.

For each tag, you can define `TagMultiValueDelimiter` and `MatchAllValue`. These are optional.
+ `TagMultiValueDelimiter` – This option can be any string that you want to use to delimit the values when you pass them at runtime. The value can be 10 characters long, at most. In this case, a comma is used as the delimiter value.
+ `MatchAllValue` – This option can be any character that you want to use when you want to filter by all the values in that column in the dataset. Instead of listing the values one by one, you can use the character. If specified, this value can be at least one character, or at most 256 characters long. In this case, an asterisk is used as the match all value.

When you configure the tags for dataset columns, turn them on or off by using the mandatory `Status` property. To enable the tag rules, set this property to `ENABLED`. By turning on tag rules, you can use them to set tag values at runtime, as described in [Step 2: Assign values to RLS tags at runtime](#quicksight-dev-rls-tags-assign-values).

The following is an example of the response definition.

```
{
    "Status": 201,
    "Arn": "arn:aws:quicksight:us-west-2:11112222333:dataset/RLS-Dataset",
    "DataSetId": "RLS-Dataset",
    "RequestId": "aa4f3c00-b937-4175-859a-543f250f8bb2"
}
```

------
#### [ UpdateDataSet ]

You can use the `UpdateDataSet` API operation to add or update RLS tags for an existing dataset.

The following is an example of updating a dataset with RLS tags. It assumes the scenario of the logistics company described previously.

```
update-data-set
		--aws-account-id <value>
		--data-set-id <value>
		--name <value>
		--physical-table-map <value>
		[--logical-table-map <value>]
		--import-mode <value>
		[--column-groups <value>
		[--field-folders <value>]
		[--row-level-permission-data-set <value>]
		[--column-level-permission-rules <value>]
		[--cli-input-json <value>]
		[--generate-cli-skeleton <value>]
				[--row-level-permission-tag-configuration 
	'{
		"Status": "ENABLED",
		"TagRules": 
			[
				{
					"TagKey": "tag_retailer_id",
					"ColumnName": "retailer_id",
					"TagMultiValueDelimiter": ",",
					"MatchAllValue": "*"
				},
				{
					"TagKey": "tag_role",
					"ColumnName": "role"
				}
			],
		"TagRuleConfigurations":
			[
				tag_retailer_id
			],
			[
				tag_role
			]
	}'
]
```

The following is an example of the response definition.

```
{
    "Status": 201,
    "Arn": "arn:aws:quicksight:us-west-2:11112222333:dataset/RLS-Dataset",
    "DataSetId": "RLS-Dataset",
    "RequestId": "aa4f3c00-b937-4175-859a-543f250f8bb2"
}
```

------

**Important**  
After tags are assigned and enabled on the dataset, make sure to give Quick authors permissions to see any of the data in the dataset when authoring a dashboard.   
To give Quick authors permission to see data in the dataset, create a permissions file or query to use as dataset rules. For more information, see [Creating dataset rules for row-level security](restrict-access-to-a-data-set-using-row-level-security.md#create-data-set-rules-for-row-level-security).

For more information about the `RowLevelPermissionTagConfiguration` element, see [RowLevelPermissionTagConfiguration](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_RowLevelPermissionTagConfiguration.html) in the *Amazon Quick API Reference*.

## Step 2: Assign values to RLS tags at runtime


You can use tags for RLS only for anonymous embedding. You can set values for tags using the `GenerateEmbedUrlForAnonymousUser` API operation.

**Important**  
When you configure session tags in the API call, follow these guidelines:  
Treat session tags as security credentials. Do not expose session tags to end users or client-side code.
Implement server-side controls. Ensure that session tags are set exclusively by your trusted backend services, not by parameters that end users can modify.
Protect session tags from enumeration. Ensure that users in one tenant cannot discover or guess sessionTag values belonging to other tenants.
Review your architecture. If downstream customers or partners are allowed to call the API directly, evaluate whether those parties could specify sessionTag values for tenants they should not access.

The following example shows how to assign values to RLS tags that were defined in the dataset in the previous step.

```
POST /accounts/AwsAccountId/embed-url/anonymous-user
HTTP/1.1
Content-type: application/json

{
    "AwsAccountId": "string",
    "SessionLifetimeInMinutes": integer,
    "Namespace": "string", // The namespace to which the anonymous end user virtually belongs
    "SessionTags": // Optional: Can be used for row-level security
        [
            {
                "Key": "tag_retailer_id",
                "Value": "West,Central,South"
            },
            {
                "Key": "tag_role",
                "Value": "shift_manager"
            }
        ],
    "AuthorizedResourceArns":
        [
            "string"
        ],
    "ExperienceConfiguration":
        {
            "Dashboard":
                {
                    "InitialDashboardId": "string"
                    // This is the initial dashboard ID that the customer wants the user to land on. This ID goes in the output URL.
                }
        }
}
```

The following is an example of the response definition.

```
HTTP/1.1 Status
Content-type: application/json

{
    "EmbedUrl": "string",
    "RequestId": "string"
}
```

RLS without registering users in Quick is supported only in the `GenerateEmbedUrlForAnonymousUser` API operation. In this operation, under `SessionTags`, you can define the values for the tags associated with the dataset columns.

In this case, the following assignments are defined:
+ Values `West`, `Central`, and `South` are assigned to the `tag_retailer_id` tag at runtime. A comma is used as the delimiter, which was defined in `TagMultiValueDelimiter` in the dataset. To use all values in the column, you can set the value to `*`, which was defined as the `MatchAllValue` when creating the tag.
+ The value `shift_manager` is assigned to the `tag_role` tag.

A user who uses the generated URL can view only the rows that have the `shift_manager` value in the `role` column, and only the values `West`, `Central`, and `South` in the `retailer_id` column.
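
As a sketch, an equivalent AWS CLI call might look like the following. The account ID, namespace, and dashboard ID are placeholder values:

```
aws quicksight generate-embed-url-for-anonymous-user \
    --aws-account-id 111122223333 \
    --namespace default \
    --session-lifetime-in-minutes 60 \
    --session-tags '[{"Key": "tag_retailer_id", "Value": "West,Central,South"}, {"Key": "tag_role", "Value": "shift_manager"}]' \
    --authorized-resource-arns arn:aws:quicksight:us-west-2:111122223333:dashboard/my-dashboard \
    --experience-configuration '{"Dashboard": {"InitialDashboardId": "my-dashboard"}}'
```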

For more information about embedding dashboards for anonymous users using the `GenerateEmbedUrlForAnonymousUser` API operation, see [Embedding Amazon Quick Sight dashboards for anonymous (unregistered) users](embedded-analytics-dashboards-for-everyone.md), or [GenerateEmbedUrlForAnonymousUser](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_GenerateEmbedUrlForAnonymousUser.html) in the *Amazon Quick API Reference*.

# Using column-level security to restrict access to a dataset
Using column-level security

In the Enterprise edition of Quick, you can restrict access to a dataset by configuring column-level security (CLS) on it. A dataset or analysis with CLS enabled has the restricted ![\[The lock icon for CLS.\]](http://docs.aws.amazon.com/quick/latest/userguide/images/cls-restricted-icon.png) symbol next to it. By default, all users and groups have access to the data. By using CLS, you can manage access to specific columns in your dataset.

If you use an analysis or dashboard that contains datasets with CLS restrictions that you don't have access to, you can't create, view, or edit visuals that use the restricted fields. For most visual types, if a visual has restricted columns that you don't have access to, you can't see the visual in your analysis or dashboard.

Tables and pivot tables behave differently. If a table or pivot table uses restricted columns in the **Rows** or **Columns** field wells, and you don't have access to these restricted columns, you can't see the visual in an analysis or dashboard. If a table or pivot table has restricted columns in the **Values** field well, you can see the table in an analysis or dashboard with only the values that you have access to. The values for restricted columns show as Not Authorized.

To enable column-level security on an analysis or dashboard, you need administrator access.
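
If you manage datasets programmatically, you can also set CLS by using the `--column-level-permission-rules` parameter of the `create-data-set` and `update-data-set` API operations shown earlier in this guide. The following is a minimal sketch of the rules JSON; the principal ARN and column names are placeholders. Each rule grants the listed principals access to the listed columns:

```
[
    {
        "Principals": ["arn:aws:quicksight:us-east-1:111122223333:user/default/username"],
        "ColumnNames": ["salary", "ssn"]
    }
]
```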

**To create a new analysis with CLS**

1. On the Quick start page, choose the **Analyses** tab.

1. At upper right, choose **New analysis**.

1. Choose a dataset, and choose **Column-level security**.

1. Select the columns that you want to restrict, and then choose **Next**. By default, all groups and users have access to all columns.

1. Choose who can access each column, and then choose **Apply** to save your changes.

**To use an existing dataset for CLS**

1. On the Quick start page, choose the **Data** tab.

1. On the **Data** page, open your dataset.

1. On the dataset details page that opens, for **Column-level security**, choose **Set up**.

1. Select the columns that you want to restrict, and then choose **Next**. By default, all groups and users have access to all columns.

1. Choose who can access each column, and then choose **Apply** to save your changes.

**To create a dashboard with CLS**

1. On the Quick navigation pane, choose the **Analyses** tab.

1. Choose the analysis that you want to create a dashboard of.

1. At upper right, choose **Publish**.

1. Choose one of the following:
   + To create a new dashboard, choose **Publish new dashboard as** and enter a name for the new dashboard.
   + To replace an existing dashboard, choose **Replace an existing dashboard** and choose the dashboard from the list.

   Additionally, you can choose **Advanced publish options**. For more information, see [Publishing dashboards](creating-a-dashboard.md).

1. Choose **Publish dashboard**.

1. (Optional) Do one of the following:
   + To publish a dashboard without sharing, choose **x** at the upper right of the **Share dashboard with users** screen when it appears. You can share the dashboard later by choosing **Share** from the application bar.
   + To share the dashboard, follow the procedure in [Sharing Amazon Quick Sight dashboards](sharing-a-dashboard.md).

# Running queries as an IAM role in Amazon Quick
Running queries as an IAM role

You can enhance data security by using fine-grained access policies rather than broader permissions for data sources connected to Amazon Athena, Amazon Redshift, or Amazon S3. You start by creating an AWS Identity and Access Management (IAM) role with permissions that are activated when a person or an API starts a query. Then, a Quick administrator or a developer assigns the IAM role to an Athena or Amazon S3 data source. With the role in place, any person or API that runs the query has exactly the permissions necessary to run the query. 

Here are some things to consider before you commit to implementing run-as roles to enhance data security: 
+ Articulate how the additional security works to your advantage.
+ Work with your Quick administrator to learn if adding roles to data sources helps you to better meet your security goals or requirements. 
+ Ask whether this type of security, given the number of data sources, people, and applications involved, can feasibly be documented and maintained by your team. If not, who will undertake that part of the work?
+ In a structured organization, locate stakeholders in parallel teams in Operations, Development, and IT Support. Ask for their experience, advice, and willingness to support your plan.
+ Before you launch your project, consider doing a proof of concept that involves the people who need access to the data.

The following rules apply to using run-as roles with Athena, Amazon Redshift, and Amazon S3:
+ Each data source can have only one associated RoleArn. Consumers of the data source, who typically access datasets and visuals, can generate many different types of queries. The role places boundaries on which queries work and which don't work.
+ The ARN must correspond to an IAM role in the same AWS account as the Quick instance that uses it.
+ The IAM role must have a trust relationship allowing Quick to assume the role.
+ The identity that calls Quick's APIs must have permission to pass the role before they can update the `RoleArn` property (see the example policy after this list). You only need to pass the role when creating or updating the role ARN. The permissions aren't re-evaluated later on. Similarly, the permission isn't required when the role ARN is omitted.
+ When the role ARN is omitted, the Athena or Amazon S3 data source uses the account-wide role and scope-down policies.
+ When the role ARN is present, the account-wide role and any scope-down policies are both ignored. For Athena data sources, Lake Formation permissions are not ignored.
+ For Amazon S3 data sources, both the manifest file and the data specified by the manifest file must be accessible using the IAM role.
+ The ARN string needs to match an existing IAM role in the AWS account and AWS Region where the data is located and queried. 
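
The following is a minimal sketch of an identity-based policy statement that grants the pass-role permission described in the preceding list; the account ID and role name are placeholders:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::111122223333:role/TestAthenaRoleForQuickSight"
        }
    ]
}
```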

When Quick connects to another service in AWS, it uses an IAM role. By default, Quick creates this less granular version of the role for each service it uses, and AWS account administrators manage the role. When you add an IAM role ARN with a custom permissions policy, you override the broader role for data sources that need extra protection. For more information about policies, see [Create a customer managed policy](https://docs.aws.amazon.com/IAM/latest/UserGuide/tutorial_managed-policies.html) in the *IAM User Guide*.

## Run queries with Athena data sources
Athena data sources

Use the API to attach the ARN to the Athena data source. To do so, add the role ARN in the [RoleArn](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_RoleArn.html) property of [AthenaParameters](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_AthenaParameters.html). For verification, you can see the role ARN on the **Edit Athena data source** dialog box. However, **Role ARN** is a read-only field.

To get started, you need a custom IAM role, which we demonstrate in the following example.

Keep in mind that the following code example is for learning purposes only. Use this example only in a temporary development and testing environment, not in a production environment. The policy in this example doesn't restrict access to any specific resource, which a deployable policy must do. Also, even for development, you need to add your own AWS account information.

The following commands create a simple new role and attach a few policies that grant permissions to Quick.

```
aws iam create-role \
        --role-name TestAthenaRoleForQuickSight \
        --description "Test Athena Role For QuickSight" \
        --assume-role-policy-document '{
            "Version": "2012-10-17"		 	 	 ,
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "Service": "quicksight.amazonaws.com"
                    },
                    "Action": "sts:AssumeRole"
                }
            ]
        }'
```

After you've identified or created an IAM role to use with each data source, attach the policies by using the `attach-role-policy` command.

```
aws iam attach-role-policy \
    --role-name TestAthenaRoleForQuickSight \
    --policy-arn arn:aws:iam::222222222222:policy/service-role/AWSQuickSightS3Policy1

aws iam attach-role-policy \
    --role-name TestAthenaRoleForQuickSight \
    --policy-arn arn:aws:iam::aws:policy/service-role/AWSQuicksightAthenaAccess1

aws iam attach-role-policy \
    --role-name TestAthenaRoleForQuickSight \
    --policy-arn arn:aws:iam::aws:policy/AmazonS3Access1
```



After you verify your permissions, you can use the role in Quick data sources by creating a new data source or updating an existing one. When using these commands, update the AWS account ID and AWS Region to match your own. 

Remember, these example code snippets are not for production environments. AWS strongly recommends that you identify and use a set of least privilege policies for your production cases.

```
aws quicksight create-data-source \
        --aws-account-id 222222222222 \
        --region us-east-1 \
        --data-source-id "athena-with-custom-role" \
        --cli-input-json '{
            "Name": "Athena with a custom Role",
            "Type": "ATHENA",
            "DataSourceParameters": {
                "AthenaParameters": {
                    "RoleArn": "arn:aws:iam::222222222222:role/TestAthenaRoleForQuickSight"
                }
            }
        }'
```

## Run queries with Amazon Redshift data sources
Amazon Redshift data sources

Connect your Amazon Redshift data with the run-as role to enhance your data security with fine-grained access policies. You can create a run-as role for Amazon Redshift data sources that use a public network or a VPC connection. You specify the connection type that you want to use in the **Edit Amazon Redshift data source** dialog box. The run-as role is not supported for Amazon Redshift Serverless data sources.

To get started, you need a custom IAM role, which we demonstrate in the following example. The following commands create a sample new role and attach policies that grant permissions to Quick.

```
aws iam create-role \
--role-name TestRedshiftRoleForQuickSight \
--description "Test Redshift Role For QuickSight" \
--assume-role-policy-document '{
    "Version": "2012-10-17"		 	 	 ,
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "quicksight.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}'
```

After you identify or create an IAM role to use with each data source, attach the policies with an `attach-role-policy`. If the `redshift:GetClusterCredentialsWithIAM` permission is attached to the role that you want to use, the values for `DatabaseUser` and `DatabaseGroups` are optional.

```
aws iam attach-role-policy \
--role-name TestRedshiftRoleForQuickSight \
--policy-arn arn:aws:iam::111122223333:policy/service-role/AWSQuickSightRedshiftPolicy

aws iam create-policy --policy-name RedshiftGetClusterCredentialsPolicy1 \
--policy-document file://redshift-get-cluster-credentials-policy.json

aws iam attach-role-policy \
--role-name TestRedshiftRoleForQuickSight \
--policy-arn arn:aws:iam::111122223333:policy/RedshiftGetClusterCredentialsPolicy1

// redshift-get-cluster-credentials-policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RedshiftGetClusterCredentialsPolicy",
            "Effect": "Allow",
            "Action": [
                "redshift:GetClusterCredentials"
            ],
            "Resource": [
                "*"
            ]
        }
    ]
}
```

The preceding example attaches a policy that grants `redshift:GetClusterCredentials`, which a data source uses together with the `RoleArn`, `DatabaseUser`, and `DatabaseGroups` IAM parameters. If you want to establish the connection through only the IAM `RoleArn` parameter, attach the `redshift:GetClusterCredentialsWithIAM` permission to your role instead, as shown in the example below.

```
aws iam attach-role-policy \
--role-name TestRedshiftRoleForQuickSight \
--policy-arn arn:aws:iam::111122223333:policy/RedshiftGetClusterCredentialsPolicy1

// redshift-get-cluster-credentials-policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RedshiftGetClusterCredentialsPolicy",
            "Effect": "Allow",
            "Action": [ "redshift:GetClusterCredentialsWithIAM" ],
            "Resource": [ "*" ]
        }
    ]
}
```
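
With `redshift:GetClusterCredentialsWithIAM` attached, a data source definition can omit `DatabaseUser` and `DatabaseGroups` entirely. The following is a minimal sketch with placeholder values, showing only the `RoleArn` IAM parameter:

```
{
    "AwsAccountId": "AWSACCOUNTID",
    "DataSourceId": "DATSOURCEID",
    "Name": "Test redshift demo iam role only",
    "Type": "REDSHIFT",
    "DataSourceParameters": {
        "RedshiftParameters": {
            "Database": "integ",
            "Host": "redshiftdemocluster.us-west-2.redshift.amazonaws.com",
            "Port": 8192,
            "ClusterId": "redshiftdemocluster",
            "IAMParameters": {
                "RoleArn": "arn:aws:iam::222222222222:role/TestRedshiftRoleForQuickSight"
            }
        }
    }
}
```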

After you verify your permissions, you can use the role in Quick data sources by creating a new data source or updating an existing one. When using these commands, update the AWS account ID and AWS Region to match your own.

```
aws quicksight create-data-source \
--region us-west-2 \
--endpoint https://quicksight.us-west-2.quicksight.aws.com/ \
--cli-input-json file://redshift-data-source-iam.json

// redshift-data-source-iam.json
{
    "AwsAccountId": "AWSACCOUNTID",
    "DataSourceId": "DATSOURCEID",
    "Name": "Test redshift demo iam",
    "Type": "REDSHIFT",
    "DataSourceParameters": {
        "RedshiftParameters": {
            "Database": "integ",
            "Host": "redshiftdemocluster.us-west-2.redshift.amazonaws.com",
            "Port": 8192,
            "ClusterId": "redshiftdemocluster",
            "IAMParameters": {
                "RoleArn": "arn:aws:iam::222222222222:role/TestRedshiftRoleForQuickSight",
                "DatabaseUser": "user",
                "DatabaseGroups": ["admin_group", "guest_group", "guest_group_1"]
            }
        }
    },
    "Permissions": [
      {
        "Principal": "arn:aws:quicksight:us-east-1:AWSACCOUNTID:user/default/demoname",
        "Actions": [
          "quicksight:DescribeDataSource",
          "quicksight:DescribeDataSourcePermissions",
          "quicksight:PassDataSource",
          "quicksight:UpdateDataSource",
          "quicksight:DeleteDataSource",
          "quicksight:UpdateDataSourcePermissions"
        ]
      }
    ]
}
```

If your data source uses the VPC connection type, use the following VPC configuration.

```
{
    "AwsAccountId": "AWSACCOUNTID",
    "DataSourceId": "DATSOURCEID",
    "Name": "Test redshift demo iam vpc",
    "Type": "REDSHIFT",
    "DataSourceParameters": {
        "RedshiftParameters": {
            "Database": "mydb",
            "Host": "vpcdemo.us-west-2.redshift.amazonaws.com",
            "Port": 8192,
            "ClusterId": "vpcdemo",
            "IAMParameters": {
                "RoleArn": "arn:aws:iam::222222222222:role/TestRedshiftRoleForQuickSight",
                "DatabaseUser": "user",
                "AutoCreateDatabaseUser": true
            }
        }
    },
    "VpcConnectionProperties": { 
      "VpcConnectionArn": "arn:aws:quicksight:us-west-2:222222222222:vpcConnection/VPC Name"
    },
    "Permissions": [
      {
        "Principal": "arn:aws:quicksight:us-east-1:222222222222:user/default/demoname",
        "Actions": [
          "quicksight:DescribeDataSource",
          "quicksight:DescribeDataSourcePermissions",
          "quicksight:PassDataSource",
          "quicksight:UpdateDataSource",
          "quicksight:DeleteDataSource",
          "quicksight:UpdateDataSourcePermissions"
        ]
      }
    ]
}
```
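
The preceding examples create new data sources. If you're updating an existing data source instead, a minimal sketch with `update-data-source` might look like the following. The `redshift-data-source-iam-update.json` file here is hypothetical: the same JSON as `redshift-data-source-iam.json`, minus the `Permissions` block, which `UpdateDataSource` doesn't accept (data source permissions are managed separately through `update-data-source-permissions`).

```
aws quicksight update-data-source \
--region us-west-2 \
--endpoint https://quicksight.us-west-2.quicksight.aws.com/ \
--cli-input-json file://redshift-data-source-iam-update.json

// redshift-data-source-iam-update.json (hypothetical): identical to
// redshift-data-source-iam.json, but without the "Permissions" block.
```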

If your data source uses the `redshift:GetClusterCredentialsWithIAM` permission and doesn't use the `DatabaseUser` or `DatabaseGroups` parameters, grant the role access to some or all tables in the schema. To check whether a role has been granted `SELECT` permissions on a specific table, run the following query in the Amazon Redshift Query Editor.

```
SELECT
    u.usename,
    t.schemaname || '.' || t.tablename AS full_table_name,
    HAS_TABLE_PRIVILEGE(u.usename, t.tablename, 'select') AS user_has_select_permission
FROM
    pg_user u
CROSS JOIN
    pg_tables t
WHERE
    u.usename = 'IAMR:RoleName'
    AND t.tablename = 'tableName';
```

For more information about the `SELECT` action in the Amazon Redshift Query Editor, see [SELECT](https://docs.aws.amazon.com/redshift/latest/dg/r_SELECT_synopsis.html).

To grant `SELECT` permissions to the role, run the following command in the Amazon Redshift Query Editor.

```
GRANT SELECT ON { [ TABLE ] table_name [, ...] | ALL TABLES IN SCHEMA 
schema_name [, ...] } TO "IAMR:Rolename";
```

For more information about the `GRANT` action in the Amazon Redshift Query Editor, see [GRANT](https://docs.aws.amazon.com/redshift/latest/dg/r_GRANT.html).

## Run queries with Amazon S3 data sources
Amazon S3 data sources

Amazon S3 data sources contain a manifest file that Quick uses to find and parse your data. You can upload a JSON manifest file through the Quick console, or you can provide a URL that points to a JSON file in an S3 bucket. If you choose to provide a URL, Quick must be granted permission to access the file in Amazon S3. Use the Quick administration console to control access to the manifest file and the data that it references.

With the **RoleArn** property, you can grant access to the manifest file and the data that it references through a custom IAM role that overrides the account-wide role. Use the API to attach the ARN to the manifest file of the Amazon S3 data source. To do so, include the role ARN in the [RoleArn](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_RoleArn.html) property of [S3Parameters](https://docs.aws.amazon.com/quicksight/latest/APIReference/API_S3Parameters.html). For verification, you can see the role ARN in the **Edit S3 data source** dialog box, where **Role ARN** is a read-only field.
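
The console field is read-only, but if you work from the CLI, one way to confirm which role is attached is to describe the data source. The following is a minimal sketch; the account ID and data source ID match the `create-data-source` example later in this section:

```
aws quicksight describe-data-source \
--aws-account-id 111222333444 \
--data-source-id s3-run-as-role-demo-source \
--query 'DataSource.DataSourceParameters.S3Parameters.RoleArn'
```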

To get started, create an Amazon S3 manifest file. Then, you can either upload it to Amazon Quick when you create a new Amazon S3 dataset or place the file in the Amazon S3 bucket that contains your data files. The following example shows what a manifest file might look like:

```
{
    "fileLocations": [
        {
            "URIPrefixes": [
                "s3://quicksightUser-run-as-role/data/"
            ]
        }
    ],
    "globalUploadSettings": {
        "format": "CSV",
        "delimiter": ",",
        "textqualifier": "'",
        "containsHeader": "true"
    }
}
```

For instructions on how to create a manifest file, see [Supported formats for Amazon S3 manifest files](supported-manifest-file-format.md).
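
If you choose to keep the manifest in the same bucket as your data, a standard S3 copy is enough to put it in place. A minimal sketch, assuming the `s3-bucket-name` bucket used in the policy example that follows:

```
aws s3 cp manifest.json s3://s3-bucket-name/manifest.json
```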

After you have created a manifest file and added it to your Amazon S3 bucket or uploaded it to Quick Sight, create a role or update an existing role in IAM that grants `s3:GetObject` access. The following example illustrates how to update an existing IAM role with the AWS CLI:

```
aws iam put-role-policy \
    --role-name QuickSightAccessToS3RunAsRoleBucket \
    --policy-name GrantS3RunAsRoleAccess \
    --policy-document '{
        "Version": "2012-10-17"		 	 	 ,
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": "arn:aws:s3:::s3-bucket-name"
            },
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::s3-bucket-name/manifest.json"
            },
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::s3-bucket-name/*"
            }
        ]
    }'
```

After your policy grants `s3:GetObject` access, you can begin creating data sources that use the role you updated with `put-role-policy` to access the Amazon S3 data source's manifest file.

```
aws quicksight create-data-source \
    --aws-account-id 111222333444 \
    --region us-west-2 \
    --endpoint https://quicksight.us-west-2.quicksight.aws.com/ \
    --data-source-id "s3-run-as-role-demo-source" \
    --cli-input-json '{
        "Name": "S3 with a custom Role",
        "Type": "S3",
        "DataSourceParameters": {
            "S3Parameters": {
                "RoleArn": "arn:aws:iam::111222333444:role/QuickSightAccessRunAsRoleBucket",
                "ManifestFileLocation": {
                    "Bucket": "s3-bucket-name", 
                    "Key": "manifest.json"
                }
            }
        }
    }'
```

After you verify your permissions, you can use the role in Quick Sight data sources, either by creating a new data source or updating an existing one. When using these commands, be sure to update the AWS account ID and AWS Region to match your own. 

# Deleting datasets


**Important**  
Currently, deleting a dataset is irreversible and can cause permanent loss of work. Deletes don't cascade to dependent objects. Instead, dependent objects stop working, even if you replace the deleted dataset with an identical dataset. 

Before you delete a dataset, we strongly recommend that you first point each dependent analysis or dashboard to a new dataset. 

Currently, when you delete a dataset while dependent visuals still exist, the analyses and dashboards that contain those visuals have no way to assimilate new metadata. They remain visible, but they can't function. They can't be repaired by adding an identical dataset. 

This is because datasets include metadata that is integral to the analyses and dashboards that depend on that dataset. This metadata is uniquely generated for each dataset. Although the Quick Sight engine can read the metadata, it isn't readable by humans (for example, it doesn't contain field names). So, an exact replica of the dataset has different metadata. Each dataset's metadata is unique, even for multiple datasets that share the same name and the same fields.

**To delete a dataset**

1. Make sure that the dataset isn't being used by any analysis or dashboard that someone wants to keep using.

   On the **Data** page, choose the dataset that you no longer need. Then choose **Delete Dataset** at upper-right. 

1. If you receive a warning that this dataset is in use, track down all dependent analyses and dashboards and point them to a different dataset. If this isn't feasible, try one or more of these best practices instead of deleting the dataset:
   + Rename the dataset, so that the dataset is clearly deprecated.
   + Filter the data, so that the dataset has no rows.
   + Remove everyone else's access to the dataset (for one way to do this with the CLI, see the sketch that follows this procedure).

   We recommend that you use whatever means you can to inform owners of dependent objects that this dataset is being deprecated. Also, make sure that you provide sufficient time for them to take action.

1. After you make sure that there are no dependent objects that will stop functioning after the dataset is deleted, choose the dataset and choose **Delete Dataset**. Confirm your choice, or choose **Cancel**.
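
One of the deprecation practices above is removing everyone else's access. If you manage permissions through the API, the following minimal sketch revokes one user's access with `update-data-set-permissions`; the account ID, dataset ID, and user ARN are hypothetical:

```
aws quicksight update-data-set-permissions \
--aws-account-id 111122223333 \
--data-set-id deprecated-dataset-id \
--revoke-permissions '[
    {
        "Principal": "arn:aws:quicksight:us-east-1:111122223333:user/default/someuser",
        "Actions": [
            "quicksight:DescribeDataSet",
            "quicksight:DescribeDataSetPermissions",
            "quicksight:PassDataSet",
            "quicksight:DescribeIngestion",
            "quicksight:ListIngestions"
        ]
    }
]'
```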


# Adding a dataset to an analysis


After you have created an analysis, you can add more datasets to the analysis. Then, you can use them to create more visuals. 

From within the analysis, you can open any dataset for editing, for example to add or remove fields, or perform other data preparation. You can also remove or replace datasets. 

The currently selected dataset displays at the top of the **Data** pane. This is the dataset that is used by the currently selected visual. Each visual can use only one dataset. Choosing a different visual changes the selected dataset to the one used by that visual.

To change the selected dataset manually, choose the dataset list at the top of the **Data** pane and then choose a different dataset. This deselects the currently selected visual if it doesn't use this dataset. Then, choose a visual that uses the selected dataset. Or choose **Add** in the **Visuals** pane to create a new visual using the selected dataset.

If you choose **Suggested** on the toolbar to see suggested visuals, the suggestions are based on the currently selected dataset.

Only filters for the currently selected dataset are shown in the **Filter** pane, and you can only create filters on the currently selected dataset. 

**Topics**
+ [Replacing datasets](replacing-data-sets.md)
+ [Remove a dataset from an analysis](delete-a-data-set-from-an-analysis.md)

Use the following procedure to add a dataset to an analysis or edit a dataset used by an analysis.

**To add a dataset to an analysis**

1. On the analysis page, navigate to the **Data** pane and expand the **Dataset** dropdown.

1. Choose **Add a new dataset** to add a dataset. Or, choose **Manage datasets** to edit a dataset. For more information about editing a dataset, see [Editing datasets](edit-a-data-set.md). 

1. A list of your datasets appears. Choose a dataset and then choose **Select**. To cancel, choose **Cancel**.

# Replacing datasets


In an analysis, you can add, edit, replace, or remove datasets. Use this section to learn how to replace your dataset. 

When you replace a dataset, the new dataset should have similar columns if you expect the visuals to work the way you designed them. Replacing the dataset also clears the undo and redo history for the analysis, which means you can't use the undo and redo buttons on the application bar to navigate your changes. So, when you decide to change the dataset, your analysis design should be somewhat stable, not in the middle of an editing phase.

**To replace a dataset**

1. On the analysis page, navigate to the **Data** pane and expand the **Dataset** dropdown.

1. Choose **Manage datasets**.

1. Choose the ellipsis (three dots) next to the dataset that you want to replace, and then choose **Replace**.

1. In the **Select replacement dataset** page, choose a dataset from the list, and then choose **Select**.
**Note**  
Replacing a dataset clears the undo and redo history for this analysis. 

The dataset is replaced with the new one. The field list and visuals are updated with the new dataset. 

At this point, you can choose to add a new dataset, edit the new dataset, or replace it with a different one. Choose **Close** to exit. 

## If your new dataset doesn't match


In some cases, the selected replacement dataset doesn't contain all of the fields and hierarchies used by the visuals, filters, parameters, and calculated fields in your analysis. If so, you receive a warning from Quick Sight that shows a list of mismatched or missing columns. 

If this happens, you can update the field mapping between the two datasets. 

**To update the field mapping**

1. In the **Mismatch in replacement dataset** page, choose **Update field mapping**.

1. In the **Update field mapping** page, for each field that you want to map, choose its drop-down menu, and then choose the field in the new dataset to map it to.

   If the field is missing from the new dataset, choose **Ignore this field**.

1. Choose **Confirm** to confirm your updates.

1. Choose **Close** to close the page and return to your analysis.

The dataset is replaced with the new one. The fields list and visuals are updated with the new dataset.

Any visuals that used a field that's missing from the new dataset become blank. You can re-add fields to the visual or remove the visual from your analysis.

If you change your mind after replacing the dataset, you can still recover. Let's say you replace the dataset and then find that it's too difficult to change your analysis to match the new dataset. You can undo any changes you made to your analysis. You can then replace the new dataset with the original one, or with a dataset that more closely matches the requirements of the analysis. 

# Remove a dataset from an analysis


Use the following procedure to delete a dataset from an analysis.

**To delete a dataset from an analysis**

1. On the analysis page, navigate to the **Data** pane and expand the **Dataset** dropdown.

1. Choose **Manage datasets**.

1. Choose the ellipsis (three dots) next to the dataset that you want to remove, and then choose **Remove**. You can't remove a dataset if it's the only one in the analysis.

1. Choose **Close** to close the dialog box.

# Working with data sources in Amazon Quick Sight
Working with data sources

Use a data source to access an external data store. Amazon S3 data sources save the manifest file information. In contrast, Salesforce and database data sources save connection information like credentials. In such cases, you can easily create multiple datasets from the data store without having to re-enter information. Connection information isn't saved for text or Microsoft Excel files. 

**Topics**
+ [Creating a data source](create-a-data-source.md)
+ [Editing a data source](edit-a-data-source.md)
+ [Deleting a data source](delete-a-data-source.md)

# Creating a data source



|  |
| --- |
| Intended audience: Amazon Quick authors |

As an analysis author in Amazon Quick, you don't need to understand anything about the infrastructure that you use to connect to your data. You set up a new data source only once. 

After a data source is set up, you can access it from its tile in the Quick console. You can use it to create one or more datasets. After a dataset is set up, you can also access the dataset from its tile. By abstracting away the technical details, Amazon Quick Sight simplifies data connections. 

**Note**  
You don't need to store connection settings for files that you plan to upload manually. For more information about file uploads, see [Creating datasets](creating-data-sets.md).

Before you add a new data-source connection profile to Amazon Quick, collect the information that you need to connect to the data source. In some cases, you might plan to copy and paste settings from a file. If so, make sure that the file doesn't contain formatting characters (list bullets or numbers) or blank space characters (spaces, tabs). Also make sure that the file doesn't contain nontext "gremlin" characters such as non-ASCII, null (ASCII 0), and control characters. 
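
One quick way to screen a settings file for those characters before you paste from it is a printable-ASCII check. A minimal sketch, assuming a Unix-like shell and a hypothetical `connection-settings.txt` file:

```
# Flag any line that contains a byte outside printable ASCII (space through tilde).
# This catches tabs, control characters, and non-ASCII "gremlins"; no output means clean.
LC_ALL=C grep -n '[^ -~]' connection-settings.txt
```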

The following list includes the information to collect for the most commonly used settings:
+ The data source to connect to.

  Make sure that you know which source you need to connect to for reporting. This source might be different from the source that stores, processes, or provides access to the data. 

  For example, let's say that you're a new analyst in a large company. You want to analyze data from your ordering system, which you know uses Oracle. However, you can't directly query the online transaction processing (OLTP) data. A subset of data is extracted and stored in a bucket on Amazon S3, but you don't have access to that either. Your new co-workers explain that they use AWS Glue crawlers to read the files and AWS Lake Formation to access them. With more research, you learn that you need to use an Amazon Athena query as your data source in Amazon Quick Sight. The point here is that it isn't always obvious which type of data source to choose.
+ A descriptive name for the new data source tile.

  Each new data source connection needs a unique and descriptive name. This name displays on the Amazon Quick Sight list of existing data sources, which is at the bottom of the **Create a Data Set** screen. Use a name that makes it easy to distinguish your data sources from other similar data sources. Your new Amazon Quick Sight data source profile displays both the database software logo and the custom name that you assign.
+ The name of the server or instance to connect to.

  A unique name or other identifier specifies the data source's server on your network. The descriptor varies depending on the type of data source, but it's usually one or more of the following: 
  + Hostname
  + IP address
  + Cluster ID
  + Instance ID
  + Connector
  + Site-based URL
+ The name of the collection of data that you want to use.

  The descriptor varies depending on the data source, but it's usually one of the following: 
  + Database
  + Warehouse
  + S3 bucket
  + Catalog
  + Schema

  In some cases, you might need to include a manifest file or a query. 
+ The user name that you want Amazon Quick Sight to use.

  Every time Amazon Quick Sight connects using this data source profile (tile), it uses the user name from the connection settings. In some cases, this might be your personal login. But if you're going to share this with other people, ask the system administrator about creating credentials to use for Amazon Quick Sight connections. 
+ What type of connection to use. You can choose a public network or a VPC connection. If you have more than one VPC connection available, identify which one to use to reach your source of data.
+ Any additional settings that the data source requires, such as Secure Sockets Layer (SSL) or API tokens.

After you save the connection settings as a data source profile, you can create a dataset by selecting its tile. The connections are stored as data source connection profiles in Amazon Quick Sight. 

To view your existing connection profiles, open the Quick start page, choose **Data**, choose **Create**, and then choose **New dataset**.
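
If you prefer the CLI, a rough equivalent is listing the data sources in your account. A minimal sketch with a placeholder account ID:

```
aws quicksight list-data-sources \
--aws-account-id 111122223333 \
--query 'DataSources[].{Name:Name,Type:Type,Id:DataSourceId}' \
--output table
```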

For a list of supported data source connections and examples, see [Connect to your data with integrations and datasets](connecting-to-data-examples.md).

After you create a data source in Quick Sight, you can [create a dataset](https://docs.aws.amazon.com/quicksuite/latest/userguide/creating-data-sets) in Quick Sight that contains data from the connected data source. You can also [update data source connection](https://docs.aws.amazon.com/quicksuite/latest/userguide/edit-a-data-source) information at any time.

# Editing a data source


You can edit an existing database data source to update the connection information, such as the server name or the user credentials. You can also edit an existing Amazon Athena data source to update the data source name. You can't edit Amazon S3 or Salesforce data sources.

## Editing a database data source


Use the following procedure to edit a database data source.

1. From the Quick start page, choose **Data** at left. Choose **Create** and then choose **New dataset**.

1. Choose a database data source.

1. Choose **Edit Data Source**.

1. Modify the data source information:
   + If you are editing an autodiscovered database data source, you can modify any of the following settings:
     + For **Data source name**, enter a name for the data source.
     + For **Instance ID**, choose the name of the instance or cluster that you want to connect to from the list provided.
     + **Database name** shows the default database for the **Instance ID** cluster or instance. If you want to use a different database on that cluster or instance, enter its name.
     + For **UserName**, enter the user name of a user account that has permissions to do the following: 
       + Access the target database. 
       + Read (perform a `SELECT` statement on) any tables in that database that you want to use.
     + For **Password**, enter the password for the account that you entered.
   + If you are editing an external database data source, you can modify any of the following settings:
     + For **Data source name**, enter a name for the data source.
     + For **Database server**, enter one of the following values:
       + For an Amazon Redshift cluster, enter the endpoint of the cluster without the port number. For example, if the endpoint value is `clustername.1234abcd.us-west-2.redshift.amazonaws.com:1234`, then enter `clustername.1234abcd.us-west-2.redshift.amazonaws.com`. You can get the endpoint value from the **Endpoint** field on the cluster detail page on the Amazon Redshift console.
       + For an Amazon EC2 instance of PostgreSQL, MySQL, or SQL Server, enter the public DNS address. You can get the public DNS value from the **Public DNS** field on the instance detail pane in the EC2 console.
       + For a non–Amazon EC2 instance of PostgreSQL, MySQL, or SQL Server, enter the hostname or public IP address of the database server.
     + For **Port**, enter the port that the cluster or instance uses for connections.
     + For **Database name**, enter the name of the database that you want to use.
     + For **UserName**, enter the user name of a user account that has permissions to do the following: 
       + Access the target database. 
       + Read (perform a `SELECT` statement on) any tables in that database that you want to use.
     + For **Password**, enter the password for the account that you entered.

1. Choose **Validate connection**.

1. If the connection validates, choose **Update data source**. If not, correct the connection information and try validating again.

1. If you want to create a new dataset using the updated data source, proceed with the instructions at [Creating a dataset from a database](create-a-database-data-set.md). Otherwise, close the **Choose your table** dialog box.

## Editing an Athena data source


Use the following procedure to edit an Athena data source.

1. From the Quick start page, choose **Data** at left. Choose **Create** and then choose **New dataset**.

1. Choose an Athena data source.

1. Choose **Edit Data Source**.

1. For **Data source name**, enter a new name.

1. Choose **Update data source** to save your changes.

1. If you want to create a new dataset using the updated data source, proceed with the instructions at [Creating a dataset using Amazon Athena data](create-a-data-set-athena.md). Otherwise, close the **Choose your table** dialog box.

# Deleting a data source


You can delete a data source if you no longer need it. Deleting a query-based database data source makes any associated datasets unusable. Deleting an Amazon S3, Salesforce, or SPICE-based database data source doesn't affect your ability to use any associated datasets. This is because the data is stored in [SPICE](spice.md). However, you can no longer refresh those datasets.

**To delete a data source**

1. Choose the data source that you want to delete.

1. Choose **Delete**.
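
If you work programmatically, a minimal sketch of the equivalent CLI call follows; the account ID and data source ID are placeholders. Keep in mind the same caveats about dependent datasets:

```
aws quicksight delete-data-source \
--aws-account-id 111122223333 \
--data-source-id my-old-data-source-id
```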