

# Connecting to Snowflake in AWS Glue Studio
<a name="connecting-to-data-snowflake"></a>

**Note**  
 You can use AWS Glue for Spark to read from and write to tables in Snowflake in AWS Glue 4.0 and later versions. To configure a Snowflake connection with AWS Glue jobs programmatically, see [Snowflake connections](aws-glue-programming-etl-connect-snowflake-home.md). 

 AWS Glue provides built-in support for Snowflake. AWS Glue Studio provides a visual interface to connect to Snowflake, author data integration jobs, and run them on the AWS Glue Studio serverless Spark runtime. 

 AWS Glue Studio creates a unified connection for Snowflake. For more information, see [Considerations](using-connectors-unified-connections.md#using-connectors-unified-connections-considerations). 

**Topics**
+ [Creating a Snowflake connection](creating-snowflake-connection.md)
+ [Creating a Snowflake source node](creating-snowflake-source-node.md)
+ [Creating a Snowflake target node](creating-snowflake-target-node.md)
+ [Set up the Authorization Code flow for Snowflake](snowflake-setup-authorization-code-flow.md)
+ [Advanced options](#creating-snowflake-connection-advanced-options)

# Creating a Snowflake connection
<a name="creating-snowflake-connection"></a>

**Note**  
 Unified connections (connection v2) standardize all connections to use `USERNAME`, `PASSWORD` keys for basic auth credentials. You can still create a v1 connection via API with secrets containing `sfUser`, `sfPassword`. 

 When adding a **Data source - Snowflake** node in AWS Glue Studio, you can choose an existing AWS Glue Snowflake connection or create a new connection. You must choose a `SNOWFLAKE` type connection, not a `JDBC` type connection configured to connect to Snowflake. Use the following procedure to create an AWS Glue Snowflake connection:

**To create a Snowflake connection**

1. In Snowflake, create a user, *snowflakeUser*, and password, *snowflakePassword*. 

1. Determine which Snowflake warehouse this user will interact with, *snowflakeWarehouse*. Either set it as the `DEFAULT_WAREHOUSE` for *snowflakeUser* in Snowflake or remember it for the next step.

1. In AWS Secrets Manager, create a secret using your Snowflake credentials. To create a secret in Secrets Manager, follow the tutorial in [Create an AWS Secrets Manager secret](https://docs.aws.amazon.com/secretsmanager/latest/userguide/create_secret.html#create_secret_cli) in the AWS Secrets Manager documentation. After creating the secret, keep the secret name, *secretName*, for the next step. 
   + When selecting **Key/value pairs**, create a pair for *snowflakeUser* with the key `sfUser`.
   + When selecting **Key/value pairs**, create a pair for *snowflakePassword* with the key `sfPassword`.
   + When selecting **Key/value pairs**, create a pair for *snowflakeWarehouse* with the key `sfWarehouse`. This is not needed if a default is set in Snowflake. 

1. In the AWS Glue Data Catalog, create a connection by following the steps in [Adding an AWS Glue connection](https://docs.aws.amazon.com//glue/latest/dg/console-connections.html). After creating the connection, keep the connection name, *connectionName*, for the next step. 
   + When selecting a **Connection type**, select Snowflake.
   + When selecting **Snowflake URL**, provide the hostname of your Snowflake instance. The URL will use a hostname in the form `account_identifier.snowflakecomputing.com`.
   + When selecting an **AWS Secret**, provide *secretName*.
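The secret described in step 3 can also be assembled in code. The following is a minimal sketch, assuming Python and, for the optional upload step only, boto3 with AWS credentials configured. The helper names `build_snowflake_secret`, `snowflake_hostname`, and `create_secret` are hypothetical; only the keys `sfUser`, `sfPassword`, and `sfWarehouse` and the hostname form come from the procedure above.

```python
import json


def build_snowflake_secret(user, password, warehouse=None):
    """Return the SecretString (JSON) for a Snowflake connection secret."""
    secret = {"sfUser": user, "sfPassword": password}
    # sfWarehouse can be omitted when DEFAULT_WAREHOUSE is set for the user.
    if warehouse:
        secret["sfWarehouse"] = warehouse
    return json.dumps(secret)


def snowflake_hostname(account_identifier):
    """Return the hostname to enter in the Snowflake URL field."""
    return f"{account_identifier}.snowflakecomputing.com"


def create_secret(secret_name, secret_string):
    """Upload the secret to AWS Secrets Manager (requires boto3 and credentials)."""
    import boto3  # imported lazily so the helpers above work without boto3

    client = boto3.client("secretsmanager")
    return client.create_secret(Name=secret_name, SecretString=secret_string)
```

Calling `build_snowflake_secret("snowflakeUser", "snowflakePassword")` produces a secret without `sfWarehouse`, which matches the case where a default warehouse is already set in Snowflake.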

# Creating a Snowflake source node
<a name="creating-snowflake-source-node"></a>

## Permissions needed
<a name="creating-snowflake-source-node-permissions"></a>

 AWS Glue Studio jobs using Snowflake data sources require additional permissions. For more information on how to add permissions to ETL jobs, see [Review IAM permissions needed for ETL jobs](https://docs.aws.amazon.com/glue/latest/ug/setting-up.html#getting-started-min-privs-job). 

 `SNOWFLAKE` AWS Glue connections use an AWS Secrets Manager secret to provide credential information. Your job and data preview roles in AWS Glue Studio must have permission to read this secret.
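As an illustration of the read permission described above, the following sketch builds an IAM policy document scoped to a single secret. It assumes that `secretsmanager:GetSecretValue` is the action the role needs; your setup may require further actions or an AWS KMS permission if the secret uses a customer managed key. The function name `secret_read_policy` and the example ARN are placeholders.

```python
import json


def secret_read_policy(secret_arn):
    """Build an IAM policy document that allows reading one secret."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumption: GetSecretValue is the read action the role needs.
                "Action": ["secretsmanager:GetSecretValue"],
                "Resource": [secret_arn],
            }
        ],
    }


# Placeholder ARN; replace it with the ARN of your own secret.
example_arn = "arn:aws:secretsmanager:us-east-1:111122223333:secret:secretName-AbCdEf"
print(json.dumps(secret_read_policy(example_arn), indent=2))
```

Attach the resulting document to both the job role and the data preview role, because both need to read the secret.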

## Adding a Snowflake data source
<a name="creating-snowflake-source-node-add"></a>

**Prerequisites**:
+ An AWS Secrets Manager secret for your Snowflake credentials
+ A Snowflake type AWS Glue Data Catalog connection

**To add a Data Source – Snowflake node:**

1.  Choose the connection for your Snowflake data source. This assumes that the connection already exists and you can select from existing connections. If you need to create a connection, choose **Create Snowflake connection**. For more information, see [ Overview of using connectors and connections ](https://docs.aws.amazon.com/glue/latest/ug/connectors-chapter.html#using-connectors-overview). 

    After you choose a connection, you can view its properties by choosing **View properties**. Information about the connection is visible, including URL, security groups, subnet, Availability Zone, description, and created (UTC) and last updated (UTC) timestamps. 

1.  Choose a Snowflake source option: 
   +  **Choose a single table** – access the data in a single Snowflake table. 
   +  **Enter custom query** – access a dataset from multiple Snowflake tables based on your custom query. 

1.  If you chose a single table, enter the name of a Snowflake schema. 

    Or, choose **Enter custom query**. Choose this option to access a custom dataset from multiple Snowflake tables. When you choose this option, enter the Snowflake query. 

1.  In **Performance and security** options (optional), configure the following: 
   +  **Enable query pushdown** – choose if you want to offload work to the Snowflake instance. 

1.  In **Custom Snowflake properties** (optional), enter parameters and values as needed. 
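For readers who script their jobs, the choices in the procedure above can be sketched as a set of connection options. This is a hedged illustration, not the script that AWS Glue Studio generates: the helper `snowflake_source_options` is hypothetical, and the option names (`connectionName`, `sfDatabase`, `sfSchema`, `dbtable`, `query`, `autopushdown`) are assumed to match the Snowflake connection option reference in the AWS Glue Developer Guide, so verify them there before use.

```python
def snowflake_source_options(connection_name, database, schema,
                             table=None, query=None, pushdown=True):
    """Build connection_options for reading from Snowflake in a Glue script."""
    # Exactly one of table / query, mirroring the two source options above.
    if (table is None) == (query is None):
        raise ValueError("Provide exactly one of table or query")
    options = {
        "connectionName": connection_name,
        "sfDatabase": database,
        "sfSchema": schema,
        # Corresponds to the "Enable query pushdown" setting.
        "autopushdown": "on" if pushdown else "off",
    }
    if table is not None:
        options["dbtable"] = table
    else:
        options["query"] = query
    return options


# Inside an AWS Glue 4.0 or later job, where glueContext is a GlueContext:
# frame = glueContext.create_dynamic_frame.from_options(
#     connection_type="snowflake",
#     connection_options=snowflake_source_options(
#         "connectionName", "databaseName", "schemaName", table="tableName"
#     ),
# )
```

The helper rejects calls that set both a table and a query, or neither, because the visual editor also makes you pick one source option.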

# Creating a Snowflake target node
<a name="creating-snowflake-target-node"></a>

## Permissions needed
<a name="creating-snowflake-target-node-permissions"></a>

 AWS Glue Studio jobs using Snowflake data targets require additional permissions. For more information on how to add permissions to ETL jobs, see [Review IAM permissions needed for ETL jobs](https://docs.aws.amazon.com/glue/latest/ug/setting-up.html#getting-started-min-privs-job). 

 `SNOWFLAKE` AWS Glue connections use an AWS Secrets Manager secret to provide credential information. Your job and data preview roles in AWS Glue Studio must have permission to read this secret.

## Adding a Snowflake data target
<a name="creating-snowflake-target-node-add"></a>

**To create a Snowflake target node:**

1.  Choose an existing Snowflake table as the target, or enter a new table name. 

1.  When you use the **Data target - Snowflake** target node, you can choose from the following options: 
   +  **APPEND** – If a table already exists, insert all the new data into the table. If the table doesn't exist, create it and then insert all new data. 
   +  **MERGE** – AWS Glue will update or append data to your target table based on the conditions you specify. 

      Choose options: 
     + **Choose keys and simple actions** – choose the columns to be used as matching keys between the source data and your target data set. 

       Specify the following options when matched:
       + Update record in your target data set with data from source.
       + Delete record in your target data set.

       Specify the following options when not matched:
       + Insert source data as a new row into your target data set.
       + Do nothing.
     + **Enter custom MERGE statement** – enter your own MERGE statement. You can then choose **Validate Merge statement** to check whether the statement is valid.
   +  **TRUNCATE** – If a table already exists, first clear the contents of the target table. If the truncate is successful, insert all data. If the table doesn't exist, create the table and insert all data. If the truncate is not successful, the operation fails. 
   +  **DROP** – If a table already exists, delete the table metadata and data. If deletion is successful, then insert all data. If the table doesn't exist, create the table and insert all data. If drop is not successful, the operation will fail. 
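To make the **Choose keys and simple actions** option concrete, the following sketch shows how matching keys and the matched / not matched actions could translate into a Snowflake `MERGE` statement. It is an illustration only: the function `build_merge_statement`, its parameters, and the aliases `t` and `s` are hypothetical, and the statement that AWS Glue actually generates may differ.

```python
def build_merge_statement(target, source, keys, columns,
                          when_matched="update", when_not_matched="insert"):
    """Build a Snowflake MERGE statement from matching keys and simple actions."""
    if not keys:
        raise ValueError("At least one matching key is required")
    # Rows match when every key column is equal in target (t) and source (s).
    on_clause = " AND ".join(f"t.{k} = s.{k}" for k in keys)
    parts = [f"MERGE INTO {target} t USING {source} s ON {on_clause}"]

    if when_matched == "update":
        non_keys = [c for c in columns if c not in keys]
        if not non_keys:
            raise ValueError("Update needs at least one non-key column")
        assignments = ", ".join(f"t.{c} = s.{c}" for c in non_keys)
        parts.append(f"WHEN MATCHED THEN UPDATE SET {assignments}")
    elif when_matched == "delete":
        parts.append("WHEN MATCHED THEN DELETE")
    else:
        raise ValueError("when_matched must be 'update' or 'delete'")

    if when_not_matched == "insert":
        col_list = ", ".join(columns)
        val_list = ", ".join(f"s.{c}" for c in columns)
        parts.append(f"WHEN NOT MATCHED THEN INSERT ({col_list}) VALUES ({val_list})")
    elif when_not_matched != "nothing":
        raise ValueError("when_not_matched must be 'insert' or 'nothing'")

    return " ".join(parts)


print(build_merge_statement("orders", "orders_staging", ["id"], ["id", "status"]))
```

A statement built this way is also a reasonable starting point for the **Enter custom MERGE statement** option, where you can adjust the conditions by hand.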

# Set up the Authorization Code flow for Snowflake
<a name="snowflake-setup-authorization-code-flow"></a>

To use the OAuth authentication method, ensure the following setup is complete:
+ **Configure Snowflake OAuth for a custom client** by following the official Snowflake documentation: [Configure Snowflake OAuth for custom clients.](https://docs.snowflake.com/en/user-guide/oauth-custom) 
+ **Set the correct redirect URI** when creating the Snowflake security integration. For example, if you are creating the connection in the Europe (Ireland) Region (eu-west-1), your redirect URI should be `https://eu-west-1.console.aws.amazon.com/gluestudio/oauth`. 
+ After creating the security integration, retain the following information for use when creating the Glue connection: 
  + `OAUTH_CLIENT_ID`: This value should be provided as User Managed Client Application Client ID on the Glue connection creation page.
  + `OAUTH_CLIENT_SECRET`: This value should be stored in the AWS Secret used for the connection, under the key `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET`.
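The two values you prepare for this flow can be sketched in a few lines of Python. The helper names below are hypothetical. The redirect URI pattern is generalized from the single eu-west-1 example above, which is an assumption; Regions outside the standard AWS partition may use a different console domain.

```python
import json


def glue_studio_redirect_uri(region):
    """Return the OAuth redirect URI for AWS Glue Studio in a Region."""
    # Assumption: other Regions follow the same pattern as the eu-west-1 example.
    return f"https://{region}.console.aws.amazon.com/gluestudio/oauth"


def oauth_secret_string(client_secret):
    """Return the SecretString that stores the Snowflake OAUTH_CLIENT_SECRET."""
    return json.dumps(
        {"USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET": client_secret}
    )


print(glue_studio_redirect_uri("eu-west-1"))
```

Use the first value when you create the Snowflake security integration, and store the second in the AWS Secrets Manager secret that the connection references.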

## Advanced options
<a name="creating-snowflake-connection-advanced-options"></a>

See [Snowflake connections](https://docs.aws.amazon.com//glue/latest/dg/aws-glue-programming-etl-connect-snowflake-home.html) in the AWS Glue Developer Guide. 