

# Adding an AWS Glue connection


You can connect to data sources in AWS Glue for Spark programmatically. For more information, see [Connection types and options for ETL in AWS Glue for Spark](aws-glue-programming-etl-connect.md).

You can also use the AWS Glue console to add, edit, delete, and test connections. For information about AWS Glue connections, see [Connecting to data](glue-connections.md).

**Topics**
+ [Connecting to Adobe Analytics](connecting-to-adobe-analytics.md)
+ [Connecting to Adobe Marketo Engage](connecting-to-data-adobe-marketo-engage.md)
+ [Connecting to Amazon Redshift in AWS Glue Studio](connecting-to-data-redshift.md)
+ [Connecting to Asana](connecting-to-asana.md)
+ [Connecting to Azure Cosmos DB in AWS Glue Studio](connecting-to-data-azurecosmos.md)
+ [Connecting to Azure SQL in AWS Glue Studio](connecting-to-data-azuresql.md)
+ [Connecting to Blackbaud Raiser's Edge NXT](connecting-to-data-blackbaud.md)
+ [Connecting to CircleCI](connecting-to-data-circleci.md)
+ [Connecting to Datadog](connecting-to-datadog.md)
+ [Connecting to Docusign Monitor](connecting-to-data-docusign-monitor.md)
+ [Connecting to Domo](connecting-to-data-domo.md)
+ [Connecting to Dynatrace](connecting-to-data-dynatrace.md)
+ [Connecting to Facebook Ads](connecting-to-data-facebook-ads.md)
+ [Connecting to Facebook Page Insights](connecting-to-data-facebook-page-insights.md)
+ [Connecting to Freshdesk](connecting-to-data-freshdesk.md)
+ [Connecting to Freshsales](connecting-to-data-freshsales.md)
+ [Connecting to Google Ads](connecting-to-googleads.md)
+ [Connecting to Google Analytics 4](connecting-to-googleanalytics.md)
+ [Connecting to Google BigQuery in AWS Glue Studio](connecting-to-data-bigquery.md)
+ [Connecting to Google Search Console](connecting-to-data-google-search-console.md)
+ [Connecting to Google Sheets](connecting-to-googlesheets.md)
+ [Connecting to HubSpot](connecting-to-data-hubspot.md)
+ [Connecting to Instagram Ads](connecting-to-data-instagram-ads.md)
+ [Connecting to Intercom in AWS Glue Studio](connecting-to-data-intercom.md)
+ [Connecting to Jira Cloud](connecting-to-data-jira-cloud.md)
+ [Connecting to Kustomer](connecting-to-data-kustomer.md)
+ [Connecting to LinkedIn](connecting-to-linkedin.md)
+ [Connecting to Mailchimp](connecting-to-mailchimp.md)
+ [Connecting to Microsoft Dynamics 365 CRM](connecting-to-microsoft-dynamics-365.md)
+ [Connecting to Microsoft Teams](connecting-to-microsoft-teams.md)
+ [Connecting to Mixpanel](connecting-to-mixpanel.md)
+ [Connecting to Monday](connecting-to-monday.md)
+ [Connecting to MongoDB in AWS Glue Studio](connecting-to-data-mongodb.md)
+ [Connecting to Oracle NetSuite](connecting-to-data-oracle-netsuite.md)
+ [Connecting to OpenSearch Service in AWS Glue Studio](connecting-to-data-opensearch.md)
+ [Connecting to Okta](connecting-to-okta.md)
+ [Connecting to PayPal](connecting-to-data-paypal.md)
+ [Connecting to Pendo](connecting-to-pendo.md)
+ [Connecting to Pipedrive](connecting-to-pipedrive.md)
+ [Connecting to Productboard](connecting-to-productboard.md)
+ [Connecting to QuickBooks](connecting-to-data-quickbooks.md)
+ [Connecting to a REST API](connecting-to-data-rest-api.md)
+ [Connecting to Salesforce](connecting-to-data-salesforce.md)
+ [Connecting to Salesforce Marketing Cloud](connecting-to-data-salesforce-marketing-cloud.md)
+ [Connecting to Salesforce Commerce Cloud](connecting-to-salesforce-commerce-cloud.md)
+ [Connecting to Salesforce Marketing Cloud Account Engagement](connecting-to-data-salesforce-marketing-cloud-account-engagement.md)
+ [Connecting to SAP HANA in AWS Glue Studio](connecting-to-data-saphana.md)
+ [Connecting to SAP OData](connecting-to-data-sap-odata.md)
+ [Connecting to SendGrid](connecting-to-data-sendgrid.md)
+ [Connecting to ServiceNow](connecting-to-data-servicenow.md)
+ [Connecting to Slack in AWS Glue Studio](connecting-to-data-slack.md)
+ [Connecting to Smartsheet](connecting-to-smartsheet.md)
+ [Connecting to Snapchat Ads in AWS Glue Studio](connecting-to-data-snapchat-ads.md)
+ [Connecting to Snowflake in AWS Glue Studio](connecting-to-data-snowflake.md)
+ [Connecting to Stripe in AWS Glue Studio](connecting-to-data-stripe.md)
+ [Connecting to Teradata Vantage in AWS Glue Studio](connecting-to-data-teradata.md)
+ [Connecting to Twilio](connecting-to-data-twilio.md)
+ [Connecting to Vertica in AWS Glue Studio](connecting-to-data-vertica.md)
+ [Connecting to WooCommerce](connecting-to-data-woocommerce.md)
+ [Connecting to Zendesk](connecting-to-data-zendesk.md)
+ [Connecting to Zoho CRM](connecting-to-data-zoho-crm.md)
+ [Connecting to Zoom Meetings](connecting-to-data-zoom-meetings.md)
+ [Adding a JDBC connection using your own JDBC drivers](console-connections-jdbc-drivers.md)

# Connecting to Adobe Analytics

Adobe Analytics is a robust data analysis platform that collects data from the multi-channel digital experiences that support the customer journey and provides tools for analyzing that data. It is commonly used by marketers and business analysts. If you're an Adobe Analytics user, you can connect AWS Glue to your Adobe Analytics account. Then, you can use Adobe Analytics as a data source in your ETL jobs. Run these jobs to transfer data between Adobe Analytics and AWS services or other supported applications.

**Topics**
+ [AWS Glue support for Adobe Analytics](adobe-analytics-support.md)
+ [Policies containing the API operations for creating and using connections](adobeanalytics-configuring-iam-permissions.md)
+ [Configuring Adobe Analytics](adobeanalytics-configuring.md)
+ [Configuring Adobe Analytics connections](adobeanalytics-configuring-connections.md)
+ [Reading from Adobe Analytics entities](adobeanalytics-reading-from-entities.md)
+ [Adobe Analytics connection options](adobeanalytics-connection-options.md)
+ [Creating an Adobe Analytics account](adobeanalytics-create-account.md)
+ [Limitations](adobeanalytics-connector-limitations.md)

# AWS Glue support for Adobe Analytics


AWS Glue supports Adobe Analytics as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Adobe Analytics.

**Supported as a target?**  
No.

**Supported Adobe Analytics API versions**  
 v2.0 

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

If you don't want to use the preceding method, you can instead use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Adobe Analytics


Before you can use AWS Glue to transfer data from Adobe Analytics, you must meet the following requirements:

## Minimum requirements

+ You have an Adobe Analytics account with an email and password. For more information about creating an account, see [Creating an Adobe Analytics account](adobeanalytics-create-account.md).
+ Your Adobe Analytics account is enabled for API access. API access is enabled by default for the Select, Prime, and Ultimate editions.

If you meet these requirements, you're ready to connect AWS Glue to your Adobe Analytics account. For typical connections, you don't need to do anything else in Adobe Analytics.

# Configuring Adobe Analytics connections


Adobe Analytics supports the `AUTHORIZATION_CODE` grant type for OAuth 2.0.

This grant type is considered "three-legged" OAuth because it relies on redirecting users to the third-party authorization server to authenticate. Users can choose to create their own connected app in Adobe Analytics and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they are still redirected to Adobe Analytics to log in and authorize AWS Glue to access their resources.

This grant type results in a refresh token and an access token. The access token is short lived and can be refreshed automatically, without user interaction, by using the refresh token.

For the public Adobe Analytics documentation on creating a connected app for the `AUTHORIZATION_CODE` OAuth flow, see [Adobe Analytics APIs](https://adobedocs.github.io/analytics-2.0-apis/).

To configure an Adobe Analytics connection:

1. In AWS Secrets Manager, create a secret with the following details:

   For a customer-managed connected app – the secret should contain the connected app Consumer Secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key.
**Note**  
You must create one secret per connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps:

   1. When selecting a **Connection type**, select Adobe Analytics.

   1. Provide the `x_api_key` and `instanceUrl` of the Adobe Analytics instance you want to connect to.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the secret (`secretName`) that you want AWS Glue to use to store the tokens for this connection.

   1. Select the network options if you want to use your own network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.
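As an illustration of step 1, the secret payload is JSON with the required key. This is a hedged sketch: the client secret value is a placeholder, and the commented-out boto3 call shows where the payload would be stored (the secret name is hypothetical):

```python
import json

# Placeholder: the Consumer Secret from your connected app.
client_secret = "example-client-secret"

# The secret must use this exact key, as described in step 1.
secret_string = json.dumps(
    {"USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET": client_secret}
)

# To store the secret, you could use boto3 (not executed here):
# import boto3
# boto3.client("secretsmanager").create_secret(
#     Name="my-adobeanalytics-connection",  # hypothetical secret name
#     SecretString=secret_string,
# )
```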

# Reading from Adobe Analytics entities


 **Prerequisites** 

An Adobe Analytics object that you would like to read from. Refer to the supported entities table below to check the available entities.

 **Supported entities** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select * | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Annotation | Yes | Yes | Yes | Yes | No | 
| Calculated Metrics | Yes | Yes | Yes | Yes | No | 
| Calculated Metrics Function | Yes | No | No | Yes | No | 
| Component Metadata Shares | Yes | Yes | No | Yes | No | 
| Date Ranges | Yes | Yes | No | Yes | No | 
| Dimensions | Yes | No | No | Yes | No | 
| Metrics | Yes | No | No | Yes | No | 
| Projects | Yes | No | No | Yes | No | 
| Reports Top Item | Yes | Yes | No | Yes | No | 
| Segments | Yes | Yes | Yes | Yes | No | 
| Usage Logs | Yes | Yes | No | Yes | No | 

 **Example** 

```
adobeAnalytics_read = glueContext.create_dynamic_frame.from_options(
    connection_type="adobeanalytics",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "annotation/ex*****",
        "API_VERSION": "v2.0"
    }
)
```

 **Adobe Analytics entity and field details** 
+ [Annotations](https://adobedocs.github.io/analytics-2.0-apis/#/Annotations)
+ [Calculated Metrics](https://adobedocs.github.io/analytics-2.0-apis/#/Calculated%20Metrics)
+ [Component Meta Data](https://adobedocs.github.io/analytics-2.0-apis/#/Component%20Meta%20Data)
+ [Date Ranges](https://adobedocs.github.io/analytics-2.0-apis/#/Date%20Ranges)
+ [Dimensions](https://adobedocs.github.io/analytics-2.0-apis/#/Dimensions)
+ [Metrics](https://adobedocs.github.io/analytics-2.0-apis/#/Metrics)
+ [Projects](https://adobedocs.github.io/analytics-2.0-apis/#/Projects)
+ [Reports](https://adobedocs.github.io/analytics-2.0-apis/#/Reports)
+ [Segments](https://adobedocs.github.io/analytics-2.0-apis/#/Segments)
+ [Users](https://adobedocs.github.io/analytics-2.0-apis/#/Users)
+ [Usage Logs](https://adobedocs.github.io/analytics-2.0-apis/#/Usage%20Logs)

# Adobe Analytics connection options


The following are connection options for Adobe Analytics:
+ `ENTITY_NAME` (String) – (Required) Used for Read/Write. The name of your object in Adobe Analytics.
+ `API_VERSION` (String) – (Required) Used for Read/Write. The Adobe Analytics REST API version you want to use. For example: v2.0.
+ `X_API_KEY` (String) – (Required) Used for Read/Write. Required to authenticate the developer or application making requests to the API.
+ `SELECTED_FIELDS` (List<String>) – Default: empty (SELECT *). Used for Read. The columns you want to select for the object.
+ `FILTER_PREDICATE` (String) – Default: empty. Used for Read. A filter condition in Spark SQL format.
+ `QUERY` (String) – Default: empty. Used for Read. A full Spark SQL query.
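As a hedged sketch of how these options combine for a read (the entity name and filter values below are illustrative, not taken from this guide), the resulting dictionary is what you would pass as `connection_options` to `create_dynamic_frame.from_options`:

```python
# Illustrative read options; only the option keys come from the list above.
connection_options = {
    "connectionName": "connectionName",
    "ENTITY_NAME": "segments",                     # hypothetical entity
    "API_VERSION": "v2.0",
    "SELECTED_FIELDS": ["id", "name"],             # omit for SELECT *
    "FILTER_PREDICATE": "name = 'Weekly Report'",  # Spark SQL syntax
}
```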

# Creating an Adobe Analytics account


1. Register for the exchange partner program by accessing the [Adobe Partner program](https://partners.adobe.com/exchangeprogram/creativecloud.html).

1. Choose **Join Exchange Program**.

1. Register or create an account using your corporate email address.

1. From the suggestion box, select the appropriate company that has an Adobe Analytics product subscription.

1. Ensure that the account is registered with a valid organization (from the list available) that has an active Adobe Analytics subscription.

1. After the company administrator's approval, activate your account by choosing the link in the approval email.

**Verifying that the account you created has access to the Adobe Analytics service**

1. Log in to [Adobe Admin Console](https://adminconsole.adobe.com/).

1. Check the organization name at the top-right corner of the page to ensure that you have logged in to the correct company.

1. Select **Products** and verify if Adobe Analytics is available.
**Note**  
If no organization is available, or the Adobe Analytics product is greyed out or unavailable, your account is likely not associated with an organization or has no active Adobe Analytics subscription. Contact your system administrator to request access to this service for your account.

**Creating a project and `OAuth2.0` credentials**

1. Log in to the Adobe Analytics account where you want the [OAuth 2.0 app](https://developer.adobe.com/developer-console/docs/guides/services/services-add-api-oauth/) to be created.

1. Select **Project** and then **Create a new project**. 

1. To add a project, select **Add to project**, and then select **API**.

1. Select **Adobe Analytics API**.

1. Select **OAUTH** as user authentication.

1. Select **Web** as `OAUTH` and provide the redirect URI. 

   For the redirect URI and its pattern, see the following:
   + OAuth 2.0 default redirect URI – A default redirect URI is the URL of the page that Adobe accesses during the authentication process. For example, `https://ap-southeast-2.console.aws.amazon.com/appflow/oauth`
   + OAuth 2.0 redirect URI pattern – A redirect URI pattern is a URI path (or comma-separated list of paths) to which Adobe can redirect (if requested) when the login flow is complete. For example, `https://ap-southeast-2\\.console\\.aws\\.amazon\\.com`

1. Add the following scopes: 
   + `openid`
   + `read_organizations`
   + `additional_info.projectedProductContext`
   + `additional_info.job_function`

1. Choose **Save credential**.

1. After the app is created, copy the `Client ID` and `Client Secret` values to a text file.

# Limitations


The following are limitations for the Adobe Analytics connector:
+ Adobe Analytics doesn't support field-based or record-based partitioning. Field-based partitioning is not supported because you cannot query the fields that you partition on. Record-based partitioning cannot be supported because there is no provision to get an offset for pagination.
+ In the `Report Top Item` entity, the `startDate` and `endDate` query parameters do not function as expected. The response is not filtered based on these parameters, which causes issues with the filter and incremental flows for this entity.
+ For the `Annotation`, `Calculated Metrics`, `Calculated Metrics Function`, `Date Ranges`, `Dimension`, `Metric`, `Project`, `Report Top Items`, and `Segment` entities, the `locale` query parameter specifies which language to use for localized sections of responses; it does not filter the records. For example, `locale="ja_JP"` displays the data in Japanese.
+ `Report Top Item` entity – filters on the `dateRange` and `lookupNoneValues` fields currently do not work.
+ `Segment` entity – with the filter value `includeType="templates"`, filters on other fields do not work.
+ `Date Range` entity – the filter on the `curatedRsid` field does not work.
+ `Metric` entity – filtering the `segmentable` field on the value `false` returns results for both `true` and `false`.

# Connecting to Adobe Marketo Engage

Adobe Marketo Engage is a marketing automation platform that enables marketers to manage personalized multi-channel programs and campaigns for prospects and customers.

**Topics**
+ [AWS Glue support for Adobe Marketo Engage](adobe-marketo-engage-support.md)
+ [Policies containing the API operations for creating and using connections](adobe-marketo-engage-configuring-iam-permissions.md)
+ [Configuring Adobe Marketo Engage](adobe-marketo-engage-configuring.md)
+ [Configuring Adobe Marketo Engage connections](adobe-marketo-engage-configuring-connections.md)
+ [Reading from Adobe Marketo Engage entities](adobe-marketo-engage-reading-from-entities.md)
+ [Writing to Adobe Marketo Engage entities](adobe-marketo-engage-writing-to-entities.md)
+ [Adobe Marketo Engage connection options](adobe-marketo-engage-connection-options.md)
+ [Limitations and notes for Adobe Marketo Engage connector](adobe-marketo-engage-connector-limitations.md)

# AWS Glue support for Adobe Marketo Engage


AWS Glue supports Adobe Marketo Engage as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Adobe Marketo Engage.

**Supported as a target?**  
Yes. You can use AWS Glue ETL jobs to write data to Adobe Marketo Engage.

**Supported Adobe Marketo Engage API versions**  
The following Adobe Marketo Engage API versions are supported:
+ v1

For entity support per API version, see Supported entities for source.

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

If you don't want to use the preceding method, you can instead use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Adobe Marketo Engage


Before you can use AWS Glue to transfer data from Adobe Marketo Engage, you must meet these requirements:

## Minimum requirements


The following are the minimum requirements:
+ You have an Adobe Marketo Engage account with client credentials.
+ Your Adobe Marketo Engage account has API access with a valid license.

If you meet these requirements, you're ready to connect AWS Glue to your Adobe Marketo Engage account. For typical connections, you don't need to do anything else in Adobe Marketo Engage.

## Getting OAuth 2.0 credentials


To obtain API credentials to make authenticated calls to your instance, see the [Adobe Marketo REST API](https://experienceleague.adobe.com/en/docs/marketo-developer/marketo/rest/rest-api).

# Configuring Adobe Marketo Engage connections


Adobe Marketo Engage supports the `CLIENT_CREDENTIALS` grant type for OAuth 2.0.
+ This grant type is considered two-legged OAuth 2.0 because clients use it to obtain an access token outside the context of a user. AWS Glue can use the client ID and client secret to authenticate to the Adobe Marketo Engage APIs, which are provided by custom services that you define.
+ Each custom service is owned by an API-only user that has a set of roles and permissions that authorize the service to perform specific actions. An access token is associated with a single custom service.
+ This grant type results in an access token that is short lived and can be renewed by calling an identity endpoint.
+ For the public Adobe Marketo Engage documentation for OAuth 2.0 with client credentials, see [Authentication](https://developers.adobe-marketo-engage.com/rest-api/authentication/) in the Adobe Marketo Engage Developer Guide.

To configure an Adobe Marketo Engage connection:

1. In AWS Secrets Manager, create a secret with the following details:

   For a customer-managed connected app – the secret should contain the connected app Consumer Secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key.
**Note**  
You must create one secret per connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps:

   1. When selecting a **Connection type**, select Adobe Marketo Engage.

   1. Provide the `INSTANCE_URL` of the Adobe Marketo Engage instance you want to connect to.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the secret (`secretName`) that you want AWS Glue to use to store the tokens for this connection.

   1. Select the network options if you want to use your own network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

# Reading from Adobe Marketo Engage entities


**Prerequisite**

An Adobe Marketo Engage object that you would like to read from. You will need the object name, such as `leads`, `activities`, or `customobjects`. The following tables show the supported entities.

**Supported entities for source (synchronous)**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| leads | Yes | Yes | No | Yes | No | 
| activities | Yes | Yes | No | Yes | No | 
| customobjects | Yes | Yes | No | Yes | No | 

**Supported entities for source (asynchronous)**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| leads | Yes | No | No | Yes | Yes | 
| activities | Yes | No | No | Yes | No | 
| customobjects | Yes | No | No | Yes | Yes | 

**Example**:

```
adobe_marketo_engage_read = glueContext.create_dynamic_frame.from_options(
    connection_type="adobe-marketo-engage",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "leads",
        "API_VERSION": "v1",
        "INSTANCE_URL": "https://539-t**-6**.mktorest.com"
    }
)
```

**Adobe Marketo Engage entity and field details**:

**Entities with static metadata**: 

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/adobe-marketo-engage-reading-from-entities.html)

**Entities with dynamic metadata**:

For the following entities, Adobe Marketo Engage provides endpoints to fetch metadata dynamically, so that operator support is captured at the datatype level for each entity.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/adobe-marketo-engage-reading-from-entities.html)

## Partitioning queries


You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to use concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For the DateTime field, we accept the value in ISO format.

  Example of valid value:

  ```
  "2024-07-01T00:00:00.000Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.

The following table describes the entity partitioning field support details:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/adobe-marketo-engage-reading-from-entities.html)

Example:

```
adobe_marketo_engage_read = glueContext.create_dynamic_frame.from_options(
    connection_type="adobe-marketo-engage",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "leads",
        "API_VERSION": "v1",
        "PARTITION_FIELD": "createdAt",
        "LOWER_BOUND": "2024-07-01T00:00:00.000Z",
        "UPPER_BOUND": "2024-07-02T00:00:00.000Z",
        "NUM_PARTITIONS": "10"
    }
)
```
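Conceptually, the bounds are divided into `NUM_PARTITIONS` contiguous sub-ranges, each handled by its own sub-query. The following pure-Python sketch (an illustration under the assumption of uniform splitting, not Glue API code) shows how the one-day `createdAt` window above could map to 10 partitions:

```python
from datetime import datetime

def split_range(lower, upper, num_partitions):
    """Split [lower, upper) into num_partitions contiguous sub-ranges."""
    start = datetime.fromisoformat(lower.replace("Z", "+00:00"))
    end = datetime.fromisoformat(upper.replace("Z", "+00:00"))
    step = (end - start) / num_partitions
    return [(start + i * step, start + (i + 1) * step)
            for i in range(num_partitions)]

# Each tuple is an (inclusive lower, exclusive upper) sub-range.
parts = split_range("2024-07-01T00:00:00.000Z",
                    "2024-07-02T00:00:00.000Z", 10)
```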

# Writing to Adobe Marketo Engage entities


**Prerequisites**
+ An Adobe Marketo Engage object that you would like to write to. You will need the object name, such as `leads` or `customobjects`.
+ The Adobe Marketo connector supports three write operations:
  + INSERT
  + UPSERT
  + UPDATE
+ For `UPSERT` and `UPDATE` write operations, you must provide the `ID_FIELD_NAMES` option to specify the ID field for the records. When working with the `leads` entity, use `email` as `ID_FIELD_NAMES` for `UPSERT` operations and `id` for `UPDATE` operations. For the `customobjects` entity, use `marketoGUID` as `ID_FIELD_NAMES` for both `UPDATE` and `UPSERT` operations.

**Supported entities for Destination (Synchronous)**


| Entity name | Supported as destination | Can be inserted | Can be updated | Can be upserted | 
| --- | --- | --- | --- | --- | 
| leads | Yes | Yes(Bulk) | Yes(Bulk) | Yes(Bulk) | 
| customobjects | Yes | Yes(Bulk) | Yes(Bulk) | Yes(Bulk) | 

**Example**:

**INSERT Operation:**

```
marketo_write = glueContext.write_dynamic_frame.from_options(
    frame=frameToWrite,
    connection_type="marketo",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "leads",
        "API_VERSION": "v1",
        "WRITE_OPERATION": "INSERT"
    }
)
```

**UPDATE Operation:**

```
marketo_write = glueContext.write_dynamic_frame.from_options(
    frame=frameToWrite,
    connection_type="marketo",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "leads",
        "API_VERSION": "v1",
        "WRITE_OPERATION": "UPDATE",
        "ID_FIELD_NAMES": "id"
    }
)
```

**Note**  
For the `leads` and `customobjects` entities, Adobe Marketo provides endpoints to fetch metadata dynamically so the writable fields are identified from the Marketo API response.
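The `ID_FIELD_NAMES` rules from the prerequisites can be summarized as a small lookup table. The following sketch is illustrative only; the `id_field_for` helper is not part of the connector API:

```python
# ID field required per (entity, write operation), per the rules above.
ID_FIELDS = {
    ("leads", "UPSERT"): "email",
    ("leads", "UPDATE"): "id",
    ("customobjects", "UPSERT"): "marketoGUID",
    ("customobjects", "UPDATE"): "marketoGUID",
}

def id_field_for(entity, operation):
    """Return the ID_FIELD_NAMES value to set, or None for INSERT."""
    if operation == "INSERT":
        return None  # INSERT does not require ID_FIELD_NAMES
    return ID_FIELDS[(entity, operation)]
```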

# Adobe Marketo Engage connection options


The following are connection options for Adobe Marketo Engage:
+ `ENTITY_NAME` (String) – (Required) Used for Read. The name of your object in Adobe Marketo Engage.
+ `API_VERSION` (String) – (Required) Used for Read. The Adobe Marketo Engage REST API version you want to use. For example: v1.
+ `SELECTED_FIELDS` (List<String>) – Default: empty (SELECT *). Used for Read. The columns you want to select for the object.
+ `FILTER_PREDICATE` (String) – Default: empty. Used for Read. A filter condition in Spark SQL format.
+ `QUERY` (String) – Default: empty. Used for Read. A full Spark SQL query.
+ `PARTITION_FIELD` (String) – Used for Read. The field used to partition the query.
+ `LOWER_BOUND` (String) – Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND` (String) – Used for Read. An exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS` (Integer) – Default: 1. Used for Read. The number of partitions for the read.
+ `TRANSFER_MODE` (String) – Default: SYNC. Used for asynchronous read.
+ `WRITE_OPERATION` (String) – Default: INSERT. Used for Write. Valid values: INSERT, UPDATE, or UPSERT.
+ `ID_FIELD_NAMES` (String) – Default: null. Required for UPDATE and UPSERT.

# Limitations and notes for Adobe Marketo Engage connector


The following are limitations or notes for the Adobe Marketo Engage connector:
+ 'sinceDatetime' and 'activityTypeId' are mandatory filter parameters for the Sync Activities entity.
+ Subscriptions are allocated 50,000 API calls per day (which resets daily at 12:00AM CST). Additional daily capacity may be purchased as part of a Adobe Marketo Engage subscription. For reference, see [Adobe Marketo Rest API](https://experienceleague.adobe.com/en/docs/marketo-developer/marketo/rest/rest-api).
+ The maximum time span for the date range filter (`createdAt` or `updatedAt`) is 31 days. For reference, see [Bulk Extract - Marketo Developers](https://experienceleague.adobe.com/en/docs/marketo-developer/marketo/rest/bulk-extract/bulk-extract).
+ Subscriptions are allocated a maximum of 10 bulk extract jobs in the queue at any given time. For reference, see [Bulk Extract - Marketo Developers](https://experienceleague.adobe.com/en/docs/marketo-developer/marketo/rest/bulk-extract/bulk-extract).
+ By default, extract jobs are limited to 500 MB per day (which resets daily at 12:00 AM CST). Additional daily capacity may be purchased as part of an Adobe Marketo Engage subscription. For reference, see [Bulk Extract - Marketo Developers](https://experienceleague.adobe.com/en/docs/marketo-developer/marketo/rest/bulk-extract/bulk-extract).
+ The maximum number of concurrent export jobs is 2. For reference, see [Bulk Extract - Marketo Developers](https://experienceleague.adobe.com/en/docs/marketo-developer/marketo/rest/bulk-extract/bulk-extract).
+ The maximum number of queued export jobs (inclusive of currently exporting jobs) is 10. For reference, see [Bulk Extract - Marketo Developers](https://experienceleague.adobe.com/en/docs/marketo-developer/marketo/rest/bulk-extract/bulk-extract).
+ The maximum file size that can be extracted from a bulk job is 1 GB.
+ After an asynchronous job is created, the exported file is retained for 7 days before expiring. For reference, see [Bulk Extract - Marketo Developers](https://experienceleague.adobe.com/en/docs/marketo-developer/marketo/rest/bulk-extract/bulk-extract).
+ `createdAt` or `updatedAt` are mandatory filter parameters for the Async Leads entity.
+ `createdAt` is a mandatory filter parameter for the Async Activities entity.
+ `updatedAt` is a mandatory filter parameter for the Async Custom Object entity.
+ When using AWS Glue SaaS connectors, users cannot identify which specific records failed during a write operation to destination SaaS platforms in cases of partial failures.
+ When using Sync write operations, any fields with null values in the input file will be automatically dropped and not sent to the SaaS platform.
+ You can create or update up to 300 records in a batch for Sync write.
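
Because Sync writes accept at most 300 records per batch, callers typically chunk their records client-side before writing. The following is a minimal, generic sketch (helper code, not part of the connector API):

```
# Sketch: chunk records client-side to respect the 300-records-per-batch
# limit for Sync writes. This is generic helper code, not connector API.
def chunk_records(records, batch_size=300):
    """Yield successive batches of at most batch_size records."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]

# 700 records split into batches of 300, 300, and 100.
batches = list(chunk_records(list(range(700))))
```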

For more information, see [Adobe Marketo Engage Integration Best Practices](https://experienceleague.adobe.com/en/docs/marketo-developer/marketo/rest/marketo-integration-best-practices) and [Bulk Extract](https://experienceleague.adobe.com/en/docs/marketo-developer/marketo/rest/bulk-extract/bulk-activity-extract).

# Connecting to Amazon Redshift in AWS Glue Studio

**Note**  
You can use AWS Glue for Spark to read from and write to tables in Amazon Redshift databases outside of AWS Glue Studio. To configure Amazon Redshift with AWS Glue jobs programmatically, see [Redshift connections](aws-glue-programming-etl-connect-redshift-home.md).

AWS Glue provides built-in support for Amazon Redshift. AWS Glue Studio provides a visual interface to connect to Amazon Redshift, author data integration jobs, and run them on the AWS Glue Studio serverless Spark runtime.

**Topics**
+ [

# Creating an Amazon Redshift connection
](creating-redshift-connection.md)
+ [

# Creating an Amazon Redshift source node
](creating-redshift-source-node.md)
+ [

# Creating an Amazon Redshift target node
](creating-redshift-target-node.md)
+ [

# Advanced options
](creating-redshift-connection-advanced-options.md)

# Creating an Amazon Redshift connection


## Permissions needed


Additional permissions are needed to use Amazon Redshift clusters and Amazon Redshift serverless environments. For more information on how to add permissions to ETL jobs, see [Review IAM permissions needed for ETL jobs](https://docs.aws.amazon.com/glue/latest/ug/setting-up.html#getting-started-min-privs-job).
+  redshift:DescribeClusters 
+  redshift-serverless:ListWorkgroups 
+  redshift-serverless:ListNamespaces 
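
As a sketch, these actions can be granted with an identity-based policy statement like the following (mirroring the JSON policy style used elsewhere in this guide; scope `Resource` down where possible):

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "redshift:DescribeClusters",
        "redshift-serverless:ListWorkgroups",
        "redshift-serverless:ListNamespaces"
      ],
      "Resource": "*"
    }
  ]
}
```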

## Overview


 When adding an Amazon Redshift connection, you can choose an existing Amazon Redshift connection or create a new connection when adding a **Data source - Redshift** node in AWS Glue Studio. 

 AWS Glue supports both Amazon Redshift clusters and Amazon Redshift serverless environments. When you create a connection, Amazon Redshift serverless environments display the **serverless** label next to the connection option. 

For more information on how to create an Amazon Redshift connection, see [Moving data to and from Amazon Redshift](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-redshift.html#aws-glue-programming-etl-redshift-using).

# Creating an Amazon Redshift source node


## Permissions needed


AWS Glue Studio jobs using Amazon Redshift data sources require additional permissions. For more information on how to add permissions to ETL jobs, see [Review IAM permissions needed for ETL jobs](https://docs.aws.amazon.com/glue/latest/ug/setting-up.html#getting-started-min-privs-job).

 The following permissions are needed in order to use an Amazon Redshift connection. 
+  redshift-data:ListSchemas 
+  redshift-data:ListTables 
+  redshift-data:DescribeTable 
+  redshift-data:ExecuteStatement 
+  redshift-data:DescribeStatement 
+  redshift-data:GetStatementResult 

## Adding an Amazon Redshift data source


**To add a Data Source – Amazon Redshift node:**

1.  Choose the Amazon Redshift access type: 
   +  Direct data connection (recommended) – choose this option to access your Amazon Redshift data directly. This is the default.
   +  Data Catalog tables – choose this option if you have Data Catalog tables that you want to use. 

1.  If you choose Direct data connection, choose the connection for your Amazon Redshift data source. This assumes that the connection already exists and you can select from existing connections. If you need to create a connection, choose **Create Redshift connection**. For more information, see [Overview of using connectors and connections](https://docs.aws.amazon.com/glue/latest/ug/connectors-chapter.html#using-connectors-overview).

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. Information about the connection is visible, including the URL, security groups, subnet, Availability Zone, description, and created (UTC) and last updated (UTC) timestamps.

1.  Choose an Amazon Redshift source option: 
   +  **Choose a single table** – access the data you want from a single Amazon Redshift table. 
   +  **Enter custom query** – access a dataset from multiple Amazon Redshift tables based on your custom query. 

1.  If you chose a single table, choose the Amazon Redshift schema. The list of available schemas to choose from is determined by the selected table.

    Or, choose **Enter custom query**. Choose this option to access a custom dataset from multiple Amazon Redshift tables. When you choose this option, enter the Amazon Redshift query. 

    When connecting to an Amazon Redshift serverless environment, add the following permission to the custom query: 

   ```
               GRANT SELECT ON ALL TABLES IN <schema> TO PUBLIC
   ```

    You can choose **Infer schema** to read the schema based on the query that you entered. You can also choose **Open Redshift query editor** to enter an Amazon Redshift query. For more information, see [Querying a database using the query editor](https://docs.aws.amazon.com/redshift/latest/mgmt/query-editor.html).

1.  In **Performance and security**, choose the Amazon S3 staging directory and IAM role. 
   +  **Amazon S3 staging directory** – choose the Amazon S3 location for temporarily staging data. 
   +  **IAM role** – choose the IAM role that can write to the Amazon S3 location you selected. 

1.  In **Custom Redshift parameters - optional**, enter the parameter and value.
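
Outside the visual editor, an equivalent direct-connection read can be sketched programmatically. The connection, table, and staging-directory names below are placeholders, and the option keys follow the AWS Glue Redshift connection options; treat this as a sketch rather than a definitive script:

```
# Sketch only: placeholder names throughout.
redshift_options = {
    "useConnectionProperties": "true",
    "connectionName": "my-redshift-connection",          # placeholder
    "dbtable": "public.my_table",                        # schema-qualified table
    "redshiftTmpDir": "s3://amzn-s3-demo-bucket/temp/",  # Amazon S3 staging directory
}

def read_redshift(glueContext):
    # Returns a DynamicFrame backed by the Amazon Redshift table.
    return glueContext.create_dynamic_frame.from_options(
        connection_type="redshift",
        connection_options=redshift_options,
    )
```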

# Creating an Amazon Redshift target node


## Permissions needed


AWS Glue Studio jobs using Amazon Redshift data targets require additional permissions. For more information on how to add permissions to ETL jobs, see [Review IAM permissions needed for ETL jobs](https://docs.aws.amazon.com/glue/latest/ug/setting-up.html#getting-started-min-privs-job).

 The following permissions are needed in order to use an Amazon Redshift connection. 
+  redshift-data:ListSchemas 
+  redshift-data:ListTables 

## Adding an Amazon Redshift target node


**To create an Amazon Redshift target node:**

1.  Choose an existing Amazon Redshift table as the target, or enter a new table name. 

1.  When you use the **Data target - Redshift** target node, you can choose from the following options: 
   +  **APPEND** – If a table already exists, dump all the new data into the table as an insert. If the table doesn't exist, create it and then insert all new data. 

      Additionally, check the box if you want to update (UPSERT) existing records in the target table. The table must already exist, or the operation will fail.
   +  **MERGE** – AWS Glue will update or append data to your target table based on the conditions you specify. 
**Note**  
 To use the merge action in AWS Glue, you must enable Amazon Redshift merge functionality. For instructions on how to enable merge for your Amazon Redshift instance, see [MERGE (preview)](https://docs.aws.amazon.com/redshift/latest/dg/r_MERGE.html). 

      Choose options: 
     + **Choose keys and simple actions** – choose the columns to be used as matching keys between the source data and your target data set. 

       Specify the following options when matched:
       + Update record in your target data set with data from source.
       + Delete record in your target data set.

       Specify the following options when not matched:
       + Insert source data as a new row into your target data set.
       + Do nothing.
     + **Enter custom MERGE statement** – You can then choose **Validate Merge statement** to check whether the statement is valid.
   +  **TRUNCATE** – If a table already exists, truncate the table data by first clearing the contents of the target table. If truncate is successful, then insert all data. If the table doesn't exist, create the table and insert all data. If truncate is not successful, the operation will fail. 
   +  **DROP** – If a table already exists, delete the table metadata and data. If deletion is successful, then insert all data. If the table doesn't exist, create the table and insert all data. If drop is not successful, the operation will fail. 
   +  **CREATE** – Create a new table. If a table with the default name already exists, a postfix of `job_datetime` is appended to the new table's name for uniqueness; otherwise the default name is used. In either case, a new table is created and all the data is inserted into it. 
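
When writing programmatically rather than through the visual editor, behavior like **TRUNCATE** is commonly expressed as `preactions` SQL that runs before the insert. The following is a hedged sketch with placeholder names, not a definitive implementation:

```
# Sketch: a programmatic Redshift write where TRUNCATE semantics are supplied
# as "preactions" SQL. All names are placeholders.
write_options = {
    "useConnectionProperties": "true",
    "connectionName": "my-redshift-connection",          # placeholder
    "dbtable": "public.my_table",                        # placeholder
    "redshiftTmpDir": "s3://amzn-s3-demo-bucket/temp/",
    # Clear the target table before inserting, mirroring the TRUNCATE option.
    "preactions": "TRUNCATE TABLE public.my_table;",
}

def write_redshift(glueContext, dynamic_frame):
    # In a Glue job, writes the DynamicFrame to Amazon Redshift.
    return glueContext.write_dynamic_frame.from_options(
        frame=dynamic_frame,
        connection_type="redshift",
        connection_options=write_options,
    )
```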

# Advanced options


 See [Using the Amazon Redshift Spark connector on AWS Glue](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-redshift.html#aws-glue-programming-etl-redshift-using). 

# Connecting to Asana

Asana is a cloud-based team collaboration solution that helps teams organize, plan, and complete tasks and projects. If you're an Asana user, your account contains data about your workspaces, projects, tasks, teams, and more. You can transfer data from Asana to certain AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for Asana
](asana-support.md)
+ [

# Policies containing the API operations for creating and using connections
](asana-configuring-iam-permissions.md)
+ [

# Configuring Asana
](asana-configuring.md)
+ [

# Configuring Asana connections
](asana-configuring-connections.md)
+ [

# Reading from Asana entities
](asana-reading-from-entities.md)
+ [

# Asana connection options
](asana-connection-options.md)
+ [

# Creating an Asana account
](asana-create-account.md)
+ [

# Limitations
](asana-connector-limitations.md)

# AWS Glue support for Asana


AWS Glue supports Asana as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Asana.

**Supported as a target?**  
No.

**Supported Asana API versions**  
 1.0 

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 


```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```


Alternatively, you can use the following managed IAM policies:
+  [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring Asana


Before you can use AWS Glue to transfer data from Asana, you must meet the following requirements:

## Minimum requirements

+ You have an Asana account with email and password. For more information about creating an account, see [Creating an Asana account](asana-create-account.md). 
+ You must have an AWS account with service access to AWS Glue. 
+ Ensure you have created one of the following resources in your Asana account: 
  + A Developer App that supports `OAuth 2.0` authentication. For instructions, see [OAuth](https://developers.asana.com/docs/oauth) in the Asana Developers documentation. Alternatively, see [Creating an Asana account](asana-create-account.md). 
  + A personal access token. For more information, see [Personal access token](https://developers.asana.com/docs/personal-access-token) in the Asana Developers documentation. 

If you meet these requirements, you’re ready to connect AWS Glue to your Asana account. For typical connections, you don't need to do anything else in Asana.

# Configuring Asana connections


Asana supports the `AUTHORIZATION_CODE` grant type for `OAuth2`. 

This grant type is considered “three-legged” `OAuth` because it relies on redirecting users to the third-party authorization server to authenticate. Users may opt to create their own connected app in Asana and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they are still redirected to Asana to log in and authorize AWS Glue to access their resources. 

This grant type results in a refresh token and access token. The access token is short lived, and may be refreshed automatically without user interaction using the refresh token. 

For public Asana documentation on creating a connected app for the `AUTHORIZATION_CODE OAuth` flow, see [Asana APIs](https://developers.asana.com/docs/oauth). 

To configure an Asana connection:

1. In AWS Secrets Manager, create a secret with the following details: 
   + For customer managed connected app – The secret should contain the connected app Consumer Secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key. 

**Note**  
You must create a secret for the connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below: 

   1. When selecting a **Connection type**, select Asana.

   1. Provide the Asana environment.

   1. Select an IAM role that AWS Glue can assume and that has permissions for the following actions: 


      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```


   1. Select the secret, *secretName*, that you want AWS Glue to use to store the tokens for this connection. 

   1.  Select the network options if you want to use your network. 

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 

# Reading from Asana entities


 **Prerequisites** 

An Asana object that you would like to read from. Refer to the supported entities table below to check the available entities. 

 **Supported entities for source** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select * | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
|  Workspace  | No | Yes | No | Yes | No | 
| Tag | No | Yes | No | Yes | No | 
| User | No | Yes | No | Yes | No | 
|  Portfolio  | No | Yes | No | Yes | No | 
| Team | No | Yes | No | Yes | No | 
| Project | Yes | Yes | No | Yes | No | 
| Section | No | Yes | No | Yes | No | 
| Task  | Yes | No | No | Yes | Yes | 
| Goal | Yes | Yes | No | Yes | No | 
|  AuditLogEvent  | Yes | Yes | No | Yes | No | 
|  Status Update  | Yes | Yes | No | Yes | No | 
|  Custom Field  | No | Yes | No | Yes | No | 
|  Project Brief  | Yes | No | No | Yes | Yes | 

 **Example** 

```
read_read = glueContext.create_dynamic_frame.from_options(
    connection_type="Asana",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "task/workspace:xxxx",
        "API_VERSION": "1.0",
        "PARTITION_FIELD": "created_at",
        "LOWER_BOUND": "2024-02-05T14:09:30.115Z",
        "UPPER_BOUND": "2024-06-07T13:30:00.134Z",
        "NUM_PARTITIONS": "3"
    }
)
```

 **Asana entity and field details** 
+ [Workspace](https://developers.asana.com/docs/workspaces)
+ [Tag](https://developers.asana.com/docs/tags)
+ [User](https://developers.asana.com/docs/users)
+ [Portfolio](https://developers.asana.com/docs/portfolios)
+ [Team](https://developers.asana.com/docs/teams)
+ [Project](https://developers.asana.com/docs/get-all-projects-in-a-workspace)
+ [Section](https://developers.asana.com/docs/get-sections-in-a-project)
+ [Task](https://developers.asana.com/docs/search-tasks-in-a-workspace) 
+ [Goal](https://developers.asana.com/docs/get-goals)
+ [AuditLogEvent](https://developers.asana.com/docs/audit-log-api)
+ [Status Update](https://developers.asana.com/reference/status-updates)
+ [Custom Field](https://developers.asana.com/reference/custom-fields)
+ [Project Brief](https://developers.asana.com/reference/project-briefs)

 **Partitioning queries** 

Additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` can be provided if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently. 
+ `PARTITION_FIELD`: the name of the field to be used to partition the query. 
+ `LOWER_BOUND`: an inclusive lower bound value of the chosen partition field. 

  For date, we accept the Spark date format used in Spark SQL queries. Example of valid values: `2024-06-07T13:30:00.134Z`. 
+ `UPPER_BOUND`: an exclusive upper bound value of the chosen partition field. 
+ `NUM_PARTITIONS`: number of partitions. 

 Entity-wise partitioning field support details are captured in the following table. 


| Entity Name | Partitioning Field | Data Type | 
| --- | --- | --- | 
| Task |  created_at  | DateTime | 
| Task |  modified_at  | DateTime | 

 **Example** 

```
read_read = glueContext.create_dynamic_frame.from_options(
    connection_type="Asana",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "task/workspace:xxxx",
        "API_VERSION": "1.0",
        "PARTITION_FIELD": "created_at",
        "LOWER_BOUND": "2024-02-05T14:09:30.115Z",
        "UPPER_BOUND": "2024-06-07T13:30:00.134Z",
        "NUM_PARTITIONS": "3"
    }
)
```

# Asana connection options


The following are connection options for Asana:
+  `ENTITY_NAME`(String) – (Required) Used for Read. The name of your object in Asana. 
+  `API_VERSION`(String) – (Required) Used for Read. Asana REST API version you want to use. For example: 1.0. 
+  `SELECTED_FIELDS`(List<String>) – Default: empty (SELECT *). Used for Read. Columns you want to select for the object. 
+  `FILTER_PREDICATE`(String) – Default: empty. Used for Read. It should be in the Spark SQL format. 
+  `QUERY`(String) – Default: empty. Used for Read. Full Spark SQL query. 
+ `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition query. 
+  `LOWER_BOUND`(String)- Used for Read. An inclusive lower bound value of the chosen partition field. 
+  `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field. 
+  `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read. 
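
To illustrate the field-selection and filter options, which the earlier examples don't use, here is a sketch; the connection name, entity, fields, and filter predicate are placeholder assumptions:

```
# Sketch: an Asana read that selects specific columns and applies a filter.
# The connection name, entity, fields, and filter are placeholder assumptions.
asana_options = {
    "connectionName": "my-asana-connection",   # placeholder
    "ENTITY_NAME": "project",                  # placeholder entity
    "API_VERSION": "1.0",
    "SELECTED_FIELDS": ["gid", "name"],        # columns to select
    "FILTER_PREDICATE": "archived = false",    # Spark SQL format
}

def read_asana(glueContext):
    return glueContext.create_dynamic_frame.from_options(
        connection_type="Asana",
        connection_options=asana_options,
    )
```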

# Creating an Asana account


1. Sign up for an [Asana Account](https://asana.com/create-account) and choose **Sign Up**.

1. After logging in, you will be redirected to the [Account Setup](https://app.asana.com/0/account_setup) page. Complete the following steps:
   + Review the account setup form.
   + Fill in all the relevant details to create your Asana account.
   + Double-check the information for accuracy.

1. Choose **Create Account** or **Submit** (the exact button text may vary) to finalize your account setup.

**Creating the App in Asana for `OAuth2.0`**

1. Log in to Asana account using your [Asana Customer Credentials](https://app.asana.com/-/login). 

1. Choose your user profile icon in the top-right corner and select **My Settings** from the dropdown menu.

1. Select the **Apps** tab and then select **Manage Developer Apps**.

1. Select **Create new app** and enter the relevant details. 

1. Choose **Create Apps**.

1. On the **My Apps** page: 

   1. Select **OAuth** and in the **App Credentials** section, make a note of your Client ID and Client Secret.

   1. In the **Redirect URLs** section, add the necessary redirect URL(s).
**Note**  
Enter the Redirect URI using this format: `https://{aws-region-code}.console.aws.amazon.com/gluestudio/oauth`. For example, for US East (N. Virginia), use: `https://us-east-1.console.aws.amazon.com/gluestudio/oauth`

**Creating the App in Asana for `PAT` Token**

1. Log in to Asana account using your [Asana Customer Credentials](https://app.asana.com/-/login). 

1. Choose your user profile icon in the top-right corner and select **My Profile Settings** from the dropdown menu.

1. Select the **Apps** tab and then select **Service accounts**.

1. Select **Create new app** and enter the relevant details. 

1. Choose **Add service account**.

1. The next page displays your token. Copy it and store it securely. 
**Important**  
This token will only be displayed once. Ensure you copy it and store it securely. 

# Limitations


The following are limitations for the Asana connector:
+ Service Accounts in Enterprise Domains can only access audit log API endpoints. Authentication with a Service Account's personal access token is required to access these endpoints.
+ The Goal entity can only be accessed by user accounts with a Premium plan or above.
+ `Audit Log Event Entity` – In the connector, the `start_at` and `end_at` fields are combined into a single field `start_end_at` to support filtering and incremental transfer.
+ Partitioning cannot be supported for the `Date` field, even though it supports greater-than-or-equal-to and less-than-or-equal-to operators. Scenario: a job is created with `partitionField` as `due_on` (datatype: date), `lowerBound` as `2019-09-14`, `upperBound` as `2019-09-16`, and `numPartition` as `2`. The filter part of the endpoint URL is created as follows:
  + partition1: due_on.before=2019-09-14&due_on.after=2019-09-14
  + partition2: due_on.before=2019-09-15&due_on.after=2019-09-15

  Output:
  + In partition1, we get data with due_date as 2019-09-14 and 2019-09-15.
  + In partition2, we get the same data with due_date as 2019-09-15 (which was in partition1) along with other data, causing data duplication.
+ Filtering and partitioning cannot be supported on the same field as a bad request error is thrown from the SaaS end.
+ The Task entity requires a minimum of 1 field in the filter criteria. There is a limitation with Asana where pagination is not identified without sorting the records based on a time-based field. Hence, the `created_at` field is used along with pagination to distinguish the next set of records. The `created_at` field is marked as mandatory in the filter, with a default value of 2000-01-01T00:00:00Z if not provided. For more information about pagination, see [Tasks in a workspace](https://developers.asana.com/reference/searchtasksforworkspace).

# Connecting to Azure Cosmos DB in AWS Glue Studio

 AWS Glue provides built-in support for Azure Cosmos DB. AWS Glue Studio provides a visual interface to connect to Azure Cosmos DB for NoSQL, author data integration jobs, and run them on the AWS Glue Studio serverless Spark runtime. 

**Topics**
+ [

# Creating an Azure Cosmos DB connection
](creating-azurecosmos-connection.md)
+ [

# Creating an Azure Cosmos DB source node
](creating-azurecosmos-source-node.md)
+ [

# Creating an Azure Cosmos DB target node
](creating-azurecosmos-target-node.md)
+ [

## Advanced options
](#creating-azurecosmos-connection-advanced-options)

# Creating an Azure Cosmos DB connection


**Prerequisites**:
+ In Azure, you will need to identify or generate an Azure Cosmos DB Key for use by AWS Glue, `cosmosKey`. For more information, see [Secure access to data in Azure Cosmos DB](https://learn.microsoft.com/en-us/azure/cosmos-db/secure-access-to-data?tabs=using-primary-key) in the Azure documentation.

**To configure a connection to Azure Cosmos DB:**

1. In AWS Secrets Manager, create a secret using your Azure Cosmos DB key. To create a secret in Secrets Manager, follow the tutorial available in [Create an AWS Secrets Manager secret](https://docs.aws.amazon.com/secretsmanager/latest/userguide/create_secret.html) in the AWS Secrets Manager documentation. After creating the secret, keep the secret name, *secretName*, for the next step. 
   + When selecting **Key/value pairs**, create a pair for the key `spark.cosmos.accountKey` with the value *cosmosKey*.

1. In the AWS Glue console, create a connection by following the steps in [Adding an AWS Glue connection](console-connections.md). After creating the connection, keep the connection name, *connectionName*, for future use in AWS Glue. 
   + When selecting a **Connection type**, select Azure Cosmos DB.
   + When selecting an **AWS Secret**, provide *secretName*.

# Creating an Azure Cosmos DB source node


## Prerequisites

+ An AWS Glue Azure Cosmos DB connection, configured with an AWS Secrets Manager secret, as described in the previous section, [Creating an Azure Cosmos DB connection](creating-azurecosmos-connection.md).
+ Appropriate permissions on your job to read the secret used by the connection.
+ An Azure Cosmos DB for NoSQL container that you would like to read from. You will need identification information for the container.

  An Azure Cosmos DB for NoSQL container is identified by its database and container. You must provide the database name, *cosmosDBName*, and the container name, *cosmosContainerName*, when connecting to the Azure Cosmos DB for NoSQL API.

## Adding an Azure Cosmos DB data source


**To add a Data source – Azure Cosmos DB node:**

1.  Choose the connection for your Azure Cosmos DB data source. If you have already created it, it is available in the dropdown. If you need to create a connection, choose **Create Azure Cosmos DB connection**. For more information, see the previous section, [Creating an Azure Cosmos DB connection](creating-azurecosmos-connection.md). 

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1. Choose **Cosmos DB Database Name** – provide the name of the database you want to read from, *cosmosDBName*.

1. Choose **Azure Cosmos DB Container** – provide the name of the container you want to read from, *cosmosContainerName*.

1. Optionally, choose **Azure Cosmos DB Custom Query** – provide a SQL SELECT query to retrieve specific information from Azure Cosmos DB.

1.  In **Custom Azure Cosmos properties**, enter parameters and values as needed. 
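
The steps above can be approximated programmatically. The following is a sketch with placeholder connection, database, and container names; the option keys mirror the `spark.cosmos.*` naming used by this connector's secret configuration:

```
# Sketch: placeholder names; option keys follow the "spark.cosmos.*" naming
# used elsewhere for this connector.
cosmos_options = {
    "connectionName": "my-cosmos-connection",         # placeholder
    "spark.cosmos.database": "cosmosDBName",          # database to read from
    "spark.cosmos.container": "cosmosContainerName",  # container to read from
}

def read_cosmos(glueContext):
    return glueContext.create_dynamic_frame.from_options(
        connection_type="azurecosmos",
        connection_options=cosmos_options,
    )
```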

# Creating an Azure Cosmos DB target node


## Prerequisites

+ An AWS Glue Azure Cosmos DB connection, configured with an AWS Secrets Manager secret, as described in the previous section, [Creating an Azure Cosmos DB connection](creating-azurecosmos-connection.md).
+ Appropriate permissions on your job to read the secret used by the connection.
+ An Azure Cosmos DB container that you would like to write to. You will need identification information for the container. **You must create the container before calling the connection method.**

  An Azure Cosmos DB for NoSQL container is identified by its database and container. You must provide the database name, *cosmosDBName*, and the container name, *cosmosContainerName*, when connecting to the Azure Cosmos DB for NoSQL API.

## Adding an Azure Cosmos DB data target


**To add a Data target – Azure Cosmos DB node:**

1.  Choose the connection for your Azure Cosmos DB data target. If you have already created it, it is available in the dropdown. If you need to create a connection, choose **Create Azure Cosmos DB connection**. For more information, see the previous section, [Creating an Azure Cosmos DB connection](creating-azurecosmos-connection.md). 

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1. Choose **Cosmos DB Database Name** – provide the name of the database you want to write to, *cosmosDBName*.

1. Choose **Azure Cosmos DB Container** – provide the name of the container you want to write to, *cosmosContainerName*.

1.  In **Custom Azure Cosmos properties**, enter parameters and values as needed. 

## Advanced options


You can provide advanced options when creating an Azure Cosmos DB node. These options are the same as those available when programming AWS Glue for Spark scripts.

See [Azure Cosmos DB connections](aws-glue-programming-etl-connect-azurecosmos-home.md). 

# Connecting to Azure SQL in AWS Glue Studio

 AWS Glue provides built-in support for Azure SQL. AWS Glue Studio provides a visual interface to connect to Azure SQL, author data integration jobs, and run them on the AWS Glue Studio serverless Spark runtime. 

**Topics**
+ [

# Creating an Azure SQL connection
](creating-azuresql-connection.md)
+ [

# Creating an Azure SQL source node
](creating-azuresql-source-node.md)
+ [

# Creating an Azure SQL target node
](creating-azuresql-target-node.md)
+ [

## Advanced options
](#creating-azuresql-connection-advanced-options)

# Creating an Azure SQL connection


To connect to Azure SQL from AWS Glue, you will need to create and store your Azure SQL credentials in an AWS Secrets Manager secret, then associate that secret with an Azure SQL AWS Glue connection.

**To configure a connection to Azure SQL:**

1. In AWS Secrets Manager, create a secret using your Azure SQL credentials. To create a secret in Secrets Manager, follow the tutorial available in [Create an AWS Secrets Manager secret](https://docs.aws.amazon.com//secretsmanager/latest/userguide/create_secret.html) in the AWS Secrets Manager documentation. After creating the secret, keep the secret name, *secretName*, for the next step.
   + When selecting **Key/value pairs**, create a pair for the key `user` with the value *azuresqlUsername*.
   + When selecting **Key/value pairs**, create a pair for the key `password` with the value *azuresqlPassword*.

1. In the AWS Glue console, create a connection by following the steps in [Adding an AWS Glue connection](console-connections.md). After creating the connection, keep the connection name, *connectionName*, for future use in AWS Glue. 
   + When selecting a **Connection type**, select Azure SQL.
   + When providing **Azure SQL URL**, provide a JDBC endpoint URL.

      The URL must be in the following format: `jdbc:sqlserver://databaseServerName:databasePort;databaseName=azuresqlDBname;`.

     AWS Glue requires the following URL properties: 
     + `databaseName` – A default database in Azure SQL to connect to.

     For more information about JDBC URLs for Azure SQL Managed Instances, see the [Microsoft documentation](https://learn.microsoft.com/en-us/sql/connect/jdbc/building-the-connection-url?view=azuresqldb-mi-current).
   + When selecting an **AWS Secret**, provide *secretName*.
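The secret payload and the JDBC URL from the steps above can be sketched in Python. The server name, port, and database name below are placeholder values, and the commented-out boto3 call is illustrative, not a required step:

```python
import json

def build_secret_string(user: str, password: str) -> str:
    """Serialize the user/password key/value pairs the connection expects."""
    return json.dumps({"user": user, "password": password})

def build_azuresql_jdbc_url(server: str, port: int, database: str) -> str:
    """Assemble a JDBC URL in the required format:
    jdbc:sqlserver://databaseServerName:databasePort;databaseName=azuresqlDBname;
    """
    return f"jdbc:sqlserver://{server}:{port};databaseName={database};"

# Creating the secret itself would use AWS Secrets Manager, for example:
# import boto3
# boto3.client("secretsmanager").create_secret(
#     Name="secretName",
#     SecretString=build_secret_string("azuresqlUsername", "azuresqlPassword"),
# )

print(build_azuresql_jdbc_url("myserver.database.windows.net", 1433, "salesdb"))
# jdbc:sqlserver://myserver.database.windows.net:1433;databaseName=salesdb;
```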

# Creating an Azure SQL source node


## Prerequisites

+ An AWS Glue Azure SQL connection, configured with an AWS Secrets Manager secret, as described in the previous section, [Creating an Azure SQL connection](creating-azuresql-connection.md).
+ Appropriate permissions on your job to read the secret used by the connection.
+ An Azure SQL table you would like to read from, *tableName*.

  An Azure SQL table is identified by its database, schema, and table name. You must provide the database and table names when connecting to Azure SQL, and the schema if it is not the default, "public". The database is provided through a URL property in *connectionName*; the schema and table name are provided through the `dbtable` option.
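As a sketch of the naming rule above (assuming "public" is the default schema, as stated), the `dbtable` value can be composed like this:

```python
def dbtable_value(table: str, schema: str = "public") -> str:
    """Prefix the schema only when it differs from the default "public"."""
    return table if schema == "public" else f"{schema}.{table}"

print(dbtable_value("tableName"))           # tableName
print(dbtable_value("tableName", "sales"))  # sales.tableName
```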

## Adding an Azure SQL data source


**To add a Data source – Azure SQL node:**

1.  Choose the connection for your Azure SQL data source. Since you have created it, it should be available in the dropdown. If you need to create a connection, choose **Create Azure SQL connection**. For more information, see the previous section, [Creating an Azure SQL connection](creating-azuresql-connection.md).

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1.  Choose an **Azure SQL Source** option:
   +  **Choose a single table** – access all data from a single table.
   +  **Enter custom query** – access a dataset from multiple tables based on your custom query.

1.  If you chose a single table, enter *tableName*. 

    If you chose **Enter custom query**, enter a Transact-SQL SELECT query.

1.  In **Custom Azure SQL properties**, enter parameters and values as needed. 
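For the custom query option, a Transact-SQL SELECT statement might look like the following sketch; the schema, table, and column names are hypothetical placeholders:

```python
# Illustrative Transact-SQL query for the "Enter custom query" option;
# dbo.orders, dbo.customers, and the column names are placeholders.
custom_query = (
    "SELECT o.order_id, o.order_date, c.customer_name "
    "FROM dbo.orders AS o "
    "JOIN dbo.customers AS c ON o.customer_id = c.customer_id "
    "WHERE o.order_date >= '2024-01-01'"
)
print(custom_query)
```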

# Creating an Azure SQL target node


## Prerequisites

+ An AWS Glue Azure SQL connection, configured with an AWS Secrets Manager secret, as described in the previous section, [Creating an Azure SQL connection](creating-azuresql-connection.md).
+ Appropriate permissions on your job to read the secret used by the connection.
+ An Azure SQL table you would like to write to, *tableName*.

  An Azure SQL table is identified by its database, schema, and table name. You must provide the database and table names when connecting to Azure SQL, and the schema if it is not the default, "public". The database is provided through a URL property in *connectionName*; the schema and table name are provided through the `dbtable` option.

## Adding an Azure SQL data target


**To add a Data target – Azure SQL node:**

1.  Choose the connection for your Azure SQL data target. Since you have created it, it should be available in the dropdown. If you need to create a connection, choose **Create Azure SQL connection**. For more information, see the previous section, [Creating an Azure SQL connection](creating-azuresql-connection.md).

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1. Configure **Table name** by providing *tableName*.

1.  In **Custom Azure SQL properties**, enter parameters and values as needed. 

## Advanced options


You can provide advanced options when creating an Azure SQL node. These options are the same as those available when programming AWS Glue for Spark scripts.

See [Azure SQL connections](aws-glue-programming-etl-connect-azuresql-home.md). 

# Connecting to Blackbaud Raiser's Edge NXT

Blackbaud Raiser's Edge NXT is a comprehensive cloud-based fundraising and donor management software solution built specifically for nonprofits and the entire social good community. This connector is built on top of Blackbaud Raiser's Edge NXT's SKY API and provides operations to help manage entities found within Raiser's Edge NXT.

**Topics**
+ [

# AWS Glue support for Blackbaud Raiser's Edge NXT
](blackbaud-support.md)
+ [

# Policies containing the API operations for creating and using connections
](blackbaud-configuring-iam-permissions.md)
+ [

# Configuring Blackbaud Raiser's Edge NXT
](blackbaud-configuring.md)
+ [

# Configuring Blackbaud Raiser's Edge NXT connections
](blackbaud-configuring-connections.md)
+ [

# Reading from Blackbaud Raiser's Edge NXT entities
](blackbaud-reading-from-entities.md)
+ [

# Blackbaud Raiser's Edge NXT connection options
](blackbaud-connection-options.md)
+ [

# Blackbaud Raiser's Edge NXT limitations
](blackbaud-connection-limitations.md)

# AWS Glue support for Blackbaud Raiser's Edge NXT


AWS Glue supports Blackbaud Raiser's Edge NXT as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Blackbaud Raiser's Edge NXT.

**Supported as a target?**  
No.

**Supported Blackbaud Raiser's Edge NXT API versions**  
The following Blackbaud Raiser's Edge NXT API versions are supported:
+ v1

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]


```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

If you don't want to use the preceding method, you can instead use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Blackbaud Raiser's Edge NXT


Before you can use AWS Glue to transfer data from Blackbaud Raiser's Edge NXT, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a Blackbaud Raiser's Edge NXT account.
+ You have generated an Access Token in your Blackbaud Raiser's Edge NXT account with the appropriate read/write scope assigned to access the APIs. For more information, see [Authorization](https://developer.blackbaud.com/skyapi/docs/authorization).

If you meet these requirements, you’re ready to connect AWS Glue to your Blackbaud Raiser's Edge NXT account.

# Configuring Blackbaud Raiser's Edge NXT connections


Blackbaud Raiser's Edge NXT supports the AUTHORIZATION_CODE grant type for OAuth2.
+ This grant type is considered "three-legged" OAuth as it relies on redirecting users to a third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue console. The AWS Glue console will redirect the user to Blackbaud Raiser's Edge NXT, where the user must log in and allow AWS Glue the requested permissions to access their Blackbaud Raiser's Edge NXT instance.
+ Users may opt to create their own connected app in Blackbaud Raiser's Edge NXT and provide their own Client ID, Subscription Key, and Instance URL when creating connections through the AWS Glue console. In this scenario, they will still be redirected to Blackbaud Raiser's Edge NXT to log in and authorize AWS Glue to access their resources.
+ This grant type results in a refresh token and access token. The access token is short lived, and may be refreshed automatically without user interaction using the refresh token.
+ For public Blackbaud Raiser’s Edge NXT documentation on creating a connected app for Authorization Code OAuth flow, see [Authorization](https://developer.blackbaud.com/skyapi/docs/authorization).

To configure a Blackbaud Raiser's Edge NXT connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For the customer managed connected app, the secret should contain the connected app's client secret under the key `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET`.

   1. Note: you must create a secret for your connections in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps:

   1. When selecting a **Data Source**, select Blackbaud Raiser's Edge NXT.

   1. Provide the `INSTANCE_URL` of the Blackbaud Raiser's Edge NXT account you want to connect to.

   1. Provide the user managed client application `clientId`.

   1. Provide the subscription key associated with your account.

   1. Select the AWS IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]


      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the secret, *secretName*, that AWS Glue will use to store tokens for this connection.

   1. Select the network options if you want to use your own network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

# Reading from Blackbaud Raiser's Edge NXT entities


**Prerequisite**

A Blackbaud Raiser's Edge NXT object you would like to read from. You will need the object name.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Constituent Address | Yes | Yes | No | Yes | Yes | 
| Constituent Education | Yes | Yes | No | Yes | Yes | 
| Constituent Email address | Yes | Yes | No | Yes | Yes | 
| Constituent Phone | Yes | Yes | No | Yes | Yes | 
| Constituent Note | Yes | Yes | No | Yes | Yes | 
| Constituent Relationship | Yes | Yes | No | Yes | Yes | 
| Constituent Online presence | Yes | Yes | No | Yes | Yes | 
| Opportunity | Yes | Yes | No | Yes | Yes | 
| Appeal | Yes | Yes | No | Yes | Yes | 
| Campaign | Yes | Yes | No | Yes | Yes | 
| Fund | Yes | Yes | No | Yes | Yes | 
| Package | Yes | Yes | No | Yes | Yes | 
| Gift Batch | Yes | Yes | No | Yes | No | 
| Event Participant | Yes | Yes | Yes | Yes | Yes | 
| Constituent Fundraiser Assignment | No | No | No | Yes | No | 
| Gift | Yes | Yes | Yes | Yes | Yes | 
| Membership | Yes | Yes | No | Yes | Yes | 
| Action | Yes | Yes | No | Yes | No | 
| Constituent | Yes | Yes | Yes | Yes | Yes | 
| Constituent Goods | Yes | Yes | No | Yes | Yes | 
| Event | Yes | Yes | Yes | Yes | Yes | 
| Gift custom field | Yes | Yes | No | Yes | Yes | 

**Example**:

```
blackbaud_read = glueContext.create_dynamic_frame.from_options(
    connection_type="BLACKBAUD",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v1",
        "SUBSCRIPTION_KEY": "<subscription key associated with your developer account>"
    }
)
```

## Blackbaud Raiser's Edge NXT entity and field details


For more information about the entities and field details, see:
+ [Action](https://developer.blackbaud.com/skyapi/renxt/constituent/entities#Action)
+ [Constituent](https://developer.blackbaud.com/skyapi/renxt/constituent/entities#Constituent)
+ [Constituent Address](https://developer.blackbaud.com/skyapi/renxt/constituent/entities#Address)
+ [Constituent Membership](https://developer.blackbaud.com/skyapi/renxt/constituent/entities#Membership)
+ [Constituent Fundraiser Assignment](https://developer.blackbaud.com/skyapi/renxt/constituent/entities#FundraiserAssignment)
+ [Constituent Education](https://developer.blackbaud.com/skyapi/renxt/constituent/entities#Education)
+ [Constituent Email Address](https://developer.blackbaud.com/skyapi/renxt/constituent/entities#EmailAddress)
+ [Constituent Phone](https://developer.blackbaud.com/skyapi/renxt/constituent/entities#Phone)
+ [Constituent Note](https://developer.blackbaud.com/skyapi/renxt/constituent/entities#Note)
+ [Constituent Online Presence](https://developer.blackbaud.com/skyapi/renxt/constituent/entities#OnlinePresence)
+ [Constituent Relationship](https://developer.blackbaud.com/skyapi/renxt/constituent/entities#Relationship)
+ [Event](https://developer.blackbaud.com/skyapi/renxt/event/entities#Event)
+ [Event Participant](https://developer.blackbaud.com/skyapi/renxt/event/entities#Participant)
+ [Appeal](https://developer.blackbaud.com/skyapi/renxt/fundraising/entities#Appeal)
+ [Campaign](https://developer.blackbaud.com/skyapi/renxt/fundraising/entities#Campaign)
+ [Fund](https://developer.blackbaud.com/skyapi/renxt/fundraising/entities#Fund)
+ [Package](https://developer.blackbaud.com/skyapi/renxt/fundraising/entities#Package)
+ [Gift](https://developer.blackbaud.com/skyapi/renxt/gift/entities#Gift)
+ [Gift Custom Field](https://developer.blackbaud.com/skyapi/renxt/gift/entities#CustomField)
+ [Gift Batch](https://developer.blackbaud.com/skyapi/renxt/gift-batch/entities#GiftBatch)
+ [Opportunity](https://developer.blackbaud.com/skyapi/renxt/opportunity/entities#Opportunity)
+ [Constituent Codes](https://developer.sky.blackbaud.com/api#api=56b76470069a0509c8f1c5b3)

**Note**  
Struct and List data types are converted to the String data type, and the DateTime data type is converted to Timestamp in the response of the connector.

## Partitioning queries


**Field-based partitioning**:

Blackbaud Raiser's Edge NXT doesn't support field-based partitioning.

**Record-based partitioning**:

You can provide the additional Spark option `NUM_PARTITIONS` if you want to use concurrency in Spark. With this parameter, the original query is split into `NUM_PARTITIONS` sub-queries that can be executed by Spark tasks concurrently.

In record-based partitioning, the total number of records present is queried from the Blackbaud Raiser's Edge NXT API and divided by the `NUM_PARTITIONS` value provided. The resulting number of records is then fetched concurrently by each sub-query.
+ `NUM_PARTITIONS`: the number of partitions.
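The record-count split described above might be computed as follows; this is a sketch of the arithmetic, not AWS Glue's actual implementation:

```python
def partition_ranges(total_records: int, num_partitions: int) -> list[tuple[int, int]]:
    """Split a record count into contiguous (start, end) ranges, one per sub-query."""
    size, remainder = divmod(total_records, num_partitions)
    ranges, start = [], 0
    for i in range(num_partitions):
        # Spread any remainder across the first few partitions.
        end = start + size + (1 if i < remainder else 0)
        ranges.append((start, end))
        start = end
    return ranges

print(partition_ranges(10, 2))  # [(0, 5), (5, 10)]
```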

Example:

```
blackbaud_read = glueContext.create_dynamic_frame.from_options(
    connection_type="BLACKBAUD",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v1",
        "NUM_PARTITIONS": "2",
        "SUBSCRIPTION_KEY": "<subscription key associated with your developer account>"
    }
)
```

# Blackbaud Raiser's Edge NXT connection options


The following are connection options for Blackbaud Raiser's Edge NXT:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in Blackbaud Raiser's Edge NXT.
+ `API_VERSION`(String) - (Required) Used for Read. Blackbaud Raiser's Edge NXT Rest API version you want to use.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read. Example value: 10.
+ `SUBSCRIPTION_KEY`(String) - (Required) Default: empty. Used for Read. Subscription key associated with your developer account.
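As a sketch, these options can be combined into a single `connection_options` mapping like the one used in the read examples earlier; every value below is a placeholder, not a real identifier:

```python
# Illustrative connection_options using the documented option names;
# all values are placeholders.
connection_options = {
    "connectionName": "connectionName",
    "ENTITY_NAME": "entityName",
    "API_VERSION": "v1",
    "SELECTED_FIELDS": ["id", "date_added"],
    "FILTER_PREDICATE": "id = '1234'",
    "NUM_PARTITIONS": "2",
    "SUBSCRIPTION_KEY": "subscriptionKey",
}

# The required options from the list above must always be present.
required = {"ENTITY_NAME", "API_VERSION", "SUBSCRIPTION_KEY"}
missing = required - connection_options.keys()
print(sorted(missing))  # []
```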

# Blackbaud Raiser's Edge NXT limitations


The following are limitations or notes for Blackbaud Raiser's Edge NXT:
+ The SaaS supports only the `EQUAL_TO` operator, which returns results created or modified on or after the specified date. Additionally, the `id` field is a String data type, and there is no identification of non-nullable fields. Therefore, field-based partitioning is not supported.
+ Incremental pull is only supported by the `Event` entity with daily, monthly and weekly frequencies.
+ The Constituent Fundraiser Assignment entity returns a maximum of 20 records.
+ Record-based partitioning:
  + Not supported by the `Action`, `Constituent Fundraiser Assignment` or `Gift Batch` entities.
  + Record-based partitioning with the filter predicate is only supported by the `Event` and `Event Participant` entities. If a filter predicate is used with any other record-based supported entities, an exception will be thrown.
+ In the `Gift Custom Field` entity, the 'value' field must be used in conjunction with the 'category' field; filtering on 'value' alone leads to an unfiltered response. To compel the user to supply the 'category' field when filtering on the 'value' field, an exception is thrown if this requirement is not met.
+ The `date_added` and `last_modified` fields for all applicable entities do not support any comparative operators; they support only the equal-to operator. Also, there is no field that can be paired with these fields to provide a range of records. Hence, these fields are only queryable and cannot support incremental transfer.
+ The `added_by` field in the `Gift Batch` entity is not treated as filterable because it might not return correct results.
+ There is a latency of approximately 30 minutes for records to be retrieved via the `/GET Gift List` endpoint upon insertion of data in the `Gift` entity.
+ Support for incremental transfer has been dropped for the Gift entity due to limitations from the data source's end. 
+ There is a latency of approximately 10 minutes for the status field in the `Opportunity` entity.
+ The `Fundraiser Assignment` entity has `Constituent` as the dependent entity. The connector loads at most 5,000 IDs to choose from, to avoid the response size exceeding the maximum allowed payload size.

# Connecting to CircleCI

CircleCI is a continuous integration and continuous delivery platform. Your CircleCI account contains data about your projects, pipelines, workflows, and more. If you're a CircleCI user, you can connect AWS Glue to your CircleCI account. Then, you can use CircleCI as a data source in your ETL jobs. Run these jobs to transfer data between CircleCI and AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for CircleCI
](circleci-support.md)
+ [

# Policies containing the API operations for creating and using connections
](circleci-configuring-iam-permissions.md)
+ [

# Configuring CircleCI
](circleci-configuring.md)
+ [

# Configuring CircleCI connections
](circleci-configuring-connections.md)
+ [

# Reading from CircleCI entities
](circleci-reading-from-entities.md)
+ [

# CircleCI connection options
](circleci-connection-options.md)
+ [

# CircleCI limitations
](circleci-connection-limitations.md)

# AWS Glue support for CircleCI


AWS Glue supports CircleCI as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from CircleCI.

**Supported as a target?**  
No.

**Supported CircleCI API versions**  
The following CircleCI API versions are supported:
+ v2

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]


```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

If you don't want to use the preceding method, you can instead use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring CircleCI


Before you can use AWS Glue to transfer data from CircleCI, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have an account with CircleCI that contains the data that you want to transfer. 
+ In the user settings for your account, you've created a personal API token. For more information, see [Creating a personal API token](https://circleci.com/docs/managing-api-tokens/#creating-a-personal-api-token).
+ You provide the personal API token to AWS Glue while creating the connection.

If you meet these requirements, you’re ready to connect AWS Glue to your CircleCI account.

# Configuring CircleCI connections


CircleCI supports custom authentication.

To configure a CircleCI connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For the customer managed connected app, the secret should contain the personal API token under the key `Circle-Token`.

   1. Note: you must create a secret for your connections in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps:

   1. When selecting a **Data Source**, select CircleCI.

   1. Select the AWS IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]


      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the secret, *secretName*, that AWS Glue will use to store tokens for this connection.

   1. Select the network options if you want to use your own network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

# Reading from CircleCI entities


**Prerequisite**

A CircleCI object you would like to read from. You will need the object name.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Context | Yes | No | No | Yes | No | 
| Organization Summary Metric | Yes | No | No | Yes | No | 
| Pipeline | No | No | No | Yes | No | 
| Pipeline Workflow | Yes | No | No | Yes | No | 
| Project Branch | Yes | No | No | Yes | No | 
| Project Flaky Test | No | No | No | Yes | No | 
| Project Summary Metric | Yes | No | No | Yes | No | 
| Schedule | No | No | No | Yes | No | 
| Workflow Job Timeseries | Yes | No | No | Yes | No | 
| Workflow Metric And Trend | Yes | No | No | Yes | No | 
| Workflow Recent Run | Yes | No | No | Yes | No | 
| Workflow Summary Metric | Yes | No | No | Yes | No | 
| Workflow Test Metric | Yes | No | No | Yes | No | 

**Example**:

```
circleci_read = glueContext.create_dynamic_frame.from_options(
    connection_type="circleci",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "context/e7ea2945-dccb-4205-b673-8391fe1b3a4c",
        "API_VERSION": "v2"
    }
)
```

## CircleCI entity and field details


For more information about the entities and field details, see:
+ [Contexts](https://circleci.com/docs/api/v2/#operation/listContexts)
+ [Project Summary Metrics](https://circleci.com/docs/api/v2/#operation/getProjectWorkflowsPageData)
+ [Workflow Job Timeseries](https://circleci.com/docs/api/v2/#operation/getJobTimeseries)
+ [Organization Summary Metrics](https://circleci.com/docs/api/v2/#operation/getOrgSummaryData)
+ [Project Branches](https://circleci.com/docs/api/v2/#operation/getAllInsightsBranches)
+ [Project Flaky Tests](https://circleci.com/docs/api/v2/#operation/getFlakyTests)
+ [Workflow Recent Runs](https://circleci.com/docs/api/v2/#operation/getProjectWorkflowRuns)
+ [Workflow Summary Metrics](https://circleci.com/docs/api/v2/#operation/getProjectWorkflowMetrics)
+ [Workflow Metrics and Trends](https://circleci.com/docs/api/v2/#operation/getWorkflowSummary)
+ [Workflow Test Metrics](https://circleci.com/docs/api/v2/#operation/getProjectWorkflowTestMetrics)
+ [Pipelines](https://circleci.com/docs/api/v2/#operation/listPipelinesForProject)
+ [Pipeline Workflows](https://circleci.com/docs/api/v2/#operation/listWorkflowsByPipelineId)
+ [Schedules](https://circleci.com/docs/api/v2/#operation/listSchedulesForProject)

Entities with static metadata:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/circleci-reading-from-entities.html)

**Note**  
Struct and List data types are converted to String data type in the response of the connector.

**Partitioning queries**

CircleCI doesn’t support field-based or record-based partitioning.

# CircleCI connection options


The following are connection options for CircleCI:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in CircleCI.
+ `API_VERSION`(String) - (Required) Used for Read. CircleCI Rest API version you want to use.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.

# CircleCI limitations


The following are limitations or notes for CircleCI:
+ CircleCI does not support either field-based or record-based partitioning.
+ Filter fields containing '-' (hyphen) will work only if they are wrapped within backticks. For example: `` `workflow-name` = "abc" ``
+ The GitLab VCS type cannot be supported as there is no programmatic way to retrieve the 'Project ID' required for the GitLab VCS entity path.
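The backtick rule for hyphenated filter fields can be sketched as a small helper; the field names are placeholders:

```python
def quote_field(name: str) -> str:
    """Wrap a field name in backticks when it contains a hyphen, per the limitation above."""
    return f"`{name}`" if "-" in name else name

print(quote_field("workflow-name") + " = 'abc'")  # `workflow-name` = 'abc'
print(quote_field("status"))                      # status
```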

# Connecting to Datadog

Datadog is a monitoring and analytics platform for cloud-scale applications, including infrastructure, applications, services, and tools.

**Topics**
+ [

# AWS Glue support for Datadog
](datadog-support.md)
+ [

# Policies containing the API operations for creating and using connections
](datadog-configuring-iam-permissions.md)
+ [

# Configuring Datadog
](datadog-configuring.md)
+ [

# Configuring Datadog connections
](datadog-configuring-connections.md)
+ [

# Reading from Datadog entities
](datadog-reading-from-entities.md)
+ [

# Datadog connection options
](datadog-connection-options.md)
+ [

# Creating a Datadog account
](datadog-create-account.md)
+ [

# Limitations
](datadog-connector-limitations.md)

# AWS Glue support for Datadog


AWS Glue supports Datadog as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Datadog.

**Supported as a target?**  
No.

**Supported Datadog API versions**  
The following Datadog API versions are supported:
+ v1
+ v2

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]


```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

If you don't want to use the preceding method, alternatively, use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Datadog


Before you can use AWS Glue to transfer data from Datadog, you must meet the following requirements:

## Minimum requirements

+ You have a Datadog account with DD-API-KEY and DD-APPLICATION-KEY. For more information about creating an account, see [Creating a Datadog account](datadog-create-account.md). 
+  Your Datadog account has API access with a valid license.

   

Datadog supports the following six URLs. All Datadog API clients are configured by default to consume Datadog US1 site APIs. If you are on the Datadog EU site, you must select the https://api.datadoghq.eu URL and use the `DD-API-KEY` and `DD-APPLICATION-KEY` of the Datadog EU site to access the APIs. Similarly, for other sites, select the respective URL with the `DD-API-KEY` and `DD-APPLICATION-KEY` of that site. 
+ US1 API URL — [https://api.datadoghq.com](https://api.datadoghq.com)
+ EU API URL — [https://api.datadoghq.eu](https://api.datadoghq.eu)
+ US3 API URL — [https://api.us3.datadoghq.com](https://api.us3.datadoghq.com) 
+ US5 API URL — [https://api.us5.datadoghq.com](https://api.us5.datadoghq.com)
+ US1-FED API URL — [https://api.ddog-gov.com](https://api.ddog-gov.com)
+ Japan API URL — [https://api.ap1.datadoghq.com](https://api.ap1.datadoghq.com)

If you meet these requirements, you’re ready to connect AWS Glue to your Datadog account.
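For quick reference in scripts, these six site URLs can be captured in a small lookup table. This is a convenience sketch; the site labels used as dictionary keys are informal shorthand, not official Datadog identifiers:

```python
# Datadog site -> API URL; choose the entry matching the site where your
# DD-API-KEY and DD-APPLICATION-KEY were issued.
DATADOG_SITE_URLS = {
    "US1": "https://api.datadoghq.com",
    "EU": "https://api.datadoghq.eu",
    "US3": "https://api.us3.datadoghq.com",
    "US5": "https://api.us5.datadoghq.com",
    "US1-FED": "https://api.ddog-gov.com",
    "AP1": "https://api.ap1.datadoghq.com",
}

def instance_url_for(site: str) -> str:
    """Return the API URL to use as the connection's instance URL."""
    try:
        return DATADOG_SITE_URLS[site]
    except KeyError:
        raise ValueError(f"Unknown Datadog site: {site!r}")
```

The returned value is what you would supply as the instance URL when configuring the connection.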

# Configuring Datadog connections


Datadog supports custom authentication.

To configure a Datadog connection:

1. In AWS Secrets Manager, create a secret with the following details: 

   For a customer managed connected app, the secret should contain the connected app consumer secret with `API_KEY` and `APPLICATION_KEY` as keys. 
**Note**  
You must create a secret per connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below: 

   1. When selecting a **Connection type**, select Datadog.

   1. Provide the `Instance_Url` of the Datadog instance you want to connect to.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions: 

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection. 

   1.  Select the network options if you want to use your network. 

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 
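The secret from step 1 can also be created programmatically. The following sketch builds the secret string with `API_KEY` and `APPLICATION_KEY` as keys; the secret name in the commented call is an arbitrary example, and the boto3 call requires valid AWS credentials:

```python
import json

def build_datadog_secret(api_key: str, application_key: str) -> str:
    # The connection expects API_KEY and APPLICATION_KEY as the key names.
    return json.dumps({"API_KEY": api_key, "APPLICATION_KEY": application_key})

# With AWS credentials configured, the secret could be created like this:
#
# import boto3
# boto3.client("secretsmanager").create_secret(
#     Name="glue/datadog-connection",  # arbitrary example name
#     SecretString=build_datadog_secret("<DD-API-KEY>", "<DD-APPLICATION-KEY>"),
# )
```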

# Reading from Datadog entities


 **Prerequisites** 

A Datadog object that you would like to read from. Refer to the supported entities table below to check the available entities. 

 **Supported entities** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select * | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
|  Metrics Timeseries  | Yes | No | No | Yes | No | 
|  Log Queries  | Yes | Yes | Yes | Yes | No | 

 **Example** 

```
Datadog_read = glueContext.create_dynamic_frame.from_options(
    connection_type="datadog",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "log-queries",
        "API_VERSION": "v2",
        "INSTANCE_URL": "https://api.datadoghq.com",
        "FILTER_PREDICATE": "from = `2023-10-03T09:00:26Z`"
    }
)
```

 **Datadog entity and field details** 

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/datadog-reading-from-entities.html)

# Datadog connection options


The following are connection options for Datadog:
+  `ENTITY_NAME` (String) – (Required) Used for Read. The name of your object in Datadog.
+  `API_VERSION` (String) – (Required) Used for Read. The Datadog REST API version you want to use. The `v1` version supports the `metrics-timeseries` entity, whereas the `v2` version supports the `log-queries` entity.
+  `INSTANCE_URL` (String) – (Required) Used for Read. The Datadog instance URL, which varies per region. 
+  `SELECTED_FIELDS` (List<String>) – Default: empty (SELECT *). Used for Read. The columns you want to select for the object. 
+  `FILTER_PREDICATE` (String) – Default: empty. Used for Read. It should be in Spark SQL format. 
+  `QUERY` (String) – Default: empty. Used for Read. A full Spark SQL query. 

# Creating a Datadog account


1. Go to [https://www.datadoghq.com/](https://www.datadoghq.com/). 

1. Choose **GET STARTED FREE**.

1. Enter the required information and sign up. 

1. Install the **Datadog Agent Installer** as suggested. 

1. Ensure that the account gets registered with a valid organization (from the list available) that has an active Datadog subscription. 

1. After logging in to your Datadog account, hover over your username in the top-right corner to view the **Keys** details:

   1. To get your API key, choose **API Keys**.

   1. To get your application key, choose **Application Keys**.

# Limitations


The following are limitations for the Datadog connector:
+ Datadog doesn’t support either field-based or record-based partitioning.
+ `from` is a mandatory filter parameter for the `Log Queries` entity.
+ `from_to_date` and `query` are mandatory filter parameters for the `Metrics Timeseries` entity.
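To make these mandatory filters concrete, the following sketch shows `connection_options` dictionaries for both entities. The `log-queries` predicate mirrors the documented example; the `metrics-timeseries` predicate is an illustrative guess at the shape, so verify the exact filter syntax against your entity's fields before use. Either dictionary can be passed to `glueContext.create_dynamic_frame.from_options(connection_type="datadog", connection_options=...)`.

```python
# Log Queries: `from` is mandatory.
log_queries_options = {
    "connectionName": "connectionName",
    "ENTITY_NAME": "log-queries",
    "API_VERSION": "v2",
    "INSTANCE_URL": "https://api.datadoghq.com",
    "FILTER_PREDICATE": "from = `2023-10-03T09:00:26Z`",
}

# Metrics Timeseries: `from_to_date` and `query` are mandatory.
# This predicate is an assumption about the shape, not verified syntax.
metrics_timeseries_options = {
    "connectionName": "connectionName",
    "ENTITY_NAME": "metrics-timeseries",
    "API_VERSION": "v1",  # v1 supports metrics-timeseries
    "INSTANCE_URL": "https://api.datadoghq.com",
    "FILTER_PREDICATE": (
        "from_to_date = `2023-10-03T09:00:26Z` AND query = `avg:system.cpu.user{*}`"
    ),
}
```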

# Connecting to Docusign Monitor

Docusign Monitor helps organizations protect their agreements with round-the-clock activity tracking. The Monitor API delivers this activity tracking information directly to existing security stacks or data visualization tools, enabling teams to detect unauthorized activity, investigate incidents, and quickly respond to verified threats. It also provides the flexibility that security teams need to customize dashboards and alerts to meet specific business needs.

**Topics**
+ [

# AWS Glue support for Docusign Monitor
](docusign-monitor-support.md)
+ [

# Policies containing the API operations for creating and using connections
](docusign-monitor-configuring-iam-permissions.md)
+ [

# Configuring Docusign Monitor
](docusign-monitor-configuring.md)
+ [

# Configuring Docusign Monitor connections
](docusign-monitor-configuring-connections.md)
+ [

# Reading from Docusign Monitor entities
](docusign-monitor-reading-from-entities.md)
+ [

# Docusign Monitor connection options
](docusign-monitor-connection-options.md)
+ [

# Docusign Monitor limitations
](docusign-monitor-connection-limitations.md)

# AWS Glue support for Docusign Monitor


AWS Glue supports Docusign Monitor as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Docusign Monitor.

**Supported as a target?**  
No.

**Supported Docusign Monitor API versions**  
The following Docusign Monitor API versions are supported:
+ v2.0

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

If you prefer not to use the preceding method, you can use the following managed IAM policies instead:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Docusign Monitor


Before you can use AWS Glue to transfer data from Docusign Monitor to supported destinations, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a Docusign account where you use a Docusign software product with Docusign Monitor.
+ In the developer console for your Docusign account, you've created an OAuth 2.0 integration app for AWS Glue.

  This app provides the client credentials that AWS Glue uses to access your data securely when it makes authenticated calls to your account. For more information, see [OAuth 2.0](https://developers.docusign.com/platform/webhooks/connect/validation-and-security/oauth-connect/) in the Docusign Monitor documentation.

If you meet these requirements, you’re ready to connect AWS Glue to your Docusign Monitor account.

# Configuring Docusign Monitor connections


Docusign Monitor supports the AUTHORIZATION_CODE grant type.
+ This grant type is considered three-legged OAuth as it relies on redirecting users to the third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue console.
+ Users may opt to create their own connected app in Docusign Monitor and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to Docusign Monitor to log in and authorize AWS Glue to access their resources.
+ This grant type results in a refresh token and an access token. The access token is short lived, and may be refreshed automatically without user interaction using the refresh token.
+ For public Docusign Monitor documentation on creating a connected app for the Authorization Code OAuth flow, see [OAuth for Docusign Connect](https://developers.docusign.com/platform/webhooks/connect/validation-and-security/oauth-connect/).

To configure a Docusign Monitor connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For the customer managed connected app, the secret should contain the connected app client secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key.

   1. Note: you must create a secret for each connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. Under **Connections**, choose **Create connection**.

   1. When selecting a **Data Source**, select Docusign Monitor.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Provide the **User Managed Client Application ClientId** of the Docusign Monitor app.

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.
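For the final step, a minimal identity policy scoped to just that secret might look like the following; the Region, account ID, and secret name in the ARN are placeholders:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "secretsmanager:GetSecretValue",
        "secretsmanager:DescribeSecret"
      ],
      "Resource": "arn:aws:secretsmanager:us-east-1:111122223333:secret:secretName-*"
    }
  ]
}
```

Attach this policy to the IAM role associated with your AWS Glue job.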

# Reading from Docusign Monitor entities


**Prerequisite**

A Docusign Monitor object you would like to read from.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Monitoring Data | Yes | Yes | No | Yes | No | 

**Example**:

```
docusignmonitor_read = glueContext.create_dynamic_frame.from_options(
    connection_type="docusign_monitor",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "monitoring-data",
        "API_VERSION": "v2.0"
    }
)
```

## Docusign Monitor entity and field details


Entities with static metadata:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/docusign-monitor-reading-from-entities.html)

**Partitioning queries**

Docusign Monitor doesn’t support either field-based or record-based partitioning.

# Docusign Monitor connection options


The following are connection options for Docusign Monitor:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in Docusign Monitor.
+ `API_VERSION`(String) - (Required) Used for Read. Docusign Monitor Rest API version you want to use.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.

# Docusign Monitor limitations


The following are limitations or notes for Docusign Monitor:
+ When a filter is applied using the `cursor` field, the API retrieves records for the next seven days starting from the specified date.
+ If no filter is provided, the API retrieves records for the previous seven days from the current date of the API request.
+ Docusign Monitor does not support either field-based or record-based partitioning.
+ Docusign Monitor does not support the Order By feature.

# Connecting to Domo

Domo is a cloud-based dashboarding tool. With Domo’s enterprise application platform, the foundation you need to extend Domo is in place, so you can build your custom solutions faster.

**Topics**
+ [

# AWS Glue support for Domo
](domo-support.md)
+ [

# Policies containing the API operations for creating and using connections
](domo-configuring-iam-permissions.md)
+ [

# Configuring Domo
](domo-configuring.md)
+ [

# Configuring Domo connections
](domo-configuring-connections.md)
+ [

# Reading from Domo entities
](domo-reading-from-entities.md)
+ [

# Domo connection options
](domo-connection-options.md)
+ [

# Domo limitations
](domo-connection-limitations.md)

# AWS Glue support for Domo


AWS Glue supports Domo as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Domo.

**Supported as a target?**  
No.

**Supported Domo API versions**  
The following Domo API versions are supported:
+ v1

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

If you prefer not to use the preceding method, you can use the following managed IAM policies instead:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Domo


Before you can use AWS Glue to transfer data from Domo to supported destinations, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a Domo account enabled for API access.
+ You have an app under your Domo developer account that provides the client credentials that AWS Glue uses to access your data securely when it makes authenticated calls to your account. For more information, see [Creating a Domo developer app](#domo-configuring-creating-developer-app).

If you meet these requirements, you’re ready to connect AWS Glue to your Domo account.

## Creating a Domo developer app


To get the client ID and client secret, create a developer account:

1. Go to the [Domo developer login page.](https://developer.domo.com/manage-clients)

1. Choose **Login**.

1. Provide the domain name and choose **Continue**.

1. Hover over **My Account** and choose **New Client**.

1. Provide the name and description, select the scope ("data"), and choose **Create**.

1. Retrieve the generated **Client Id** and **Client Secret** from the new client.

# Configuring Domo connections


Domo supports the CLIENT_CREDENTIALS grant type for OAuth2.
+ This grant type is considered two-legged OAuth because only the client application authenticates itself to the server, with no user involvement.
+ Users may opt to create their own connected app in Domo and provide their own client ID and client secret when creating connections through the AWS Glue console.
+ For public Domo documentation on creating a connected app for the Authorization Code OAuth flow, see [OAuth Authentication](https://developer.domo.com/portal/1845fc11bbe5d-api-authentication).

To configure a Domo connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For the customer managed connected app, the secret should contain the connected app access token, `client_id`, and `client_secret`.

   1. Note: you must create a secret for each connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. Under **Connections**, choose **Create connection**.

   1. When selecting a **Data Source**, select Domo.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

# Reading from Domo entities


**Prerequisite**

A Domo object you would like to read from. You will need the object name such as Data Set or Data Permission Policies. The following table shows the supported entities.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Data Set | Yes | Yes | Yes | Yes | Yes | 
| Data Permission Policies | No | No | No | Yes | No | 

**Example**:

```
Domo_read = glueContext.create_dynamic_frame.from_options(
    connection_type="domo",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "dataset",
        "API_VERSION": "v1"
    }
)
```

## Domo entity and field details


Entities with static metadata:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/domo-reading-from-entities.html)

For the following entity, Domo provides endpoints to fetch metadata dynamically, so that operator support is captured at the datatype level for the entity.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/domo-reading-from-entities.html)

## Partitioning queries


**Field-based partitioning**

You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For the DateTime field, we accept the value in ISO format.

  Example of valid value:

  ```
  "2023-01-15T11:18:39.205Z"
  ```

  For the Date field, we accept the value in ISO format.

  Example of valid value:

  ```
  "2023-01-15"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.

  Example of valid value:

  ```
  "2023-02-15T11:18:39.205Z"
  ```
+ `NUM_PARTITIONS`: the number of partitions.

Entity-wise partitioning field support details are captured in the following table:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/domo-reading-from-entities.html)

Example:

```
Domo_read = glueContext.create_dynamic_frame.from_options(
    connection_type="domo",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "dataset",
        "API_VERSION": "v1",
        "PARTITION_FIELD": "permissionTime",
        "LOWER_BOUND": "2023-01-15T11:18:39.205Z",
        "UPPER_BOUND": "2023-02-15T11:18:39.205Z",
        "NUM_PARTITIONS": "2"
    }
)
```

**Record-based partitioning**

You can provide the additional Spark option `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With this parameter, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently.

In record-based partitioning, the total number of records is queried from Domo and divided by the `NUM_PARTITIONS` value you provide. Each sub-query then fetches its share of the records concurrently.

Example:

```
Domo_read = glueContext.create_dynamic_frame.from_options(
    connection_type="domo",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "dataset",
        "API_VERSION": "v1",
        "NUM_PARTITIONS": "2"
    }
)
```
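The record split described above can be sketched as follows. `partition_ranges` is a hypothetical helper for illustration only, not part of the connector:

```python
def partition_ranges(total_records: int, num_partitions: int):
    """Split a record count into near-equal contiguous [start, end) ranges,
    one per concurrent sub-query."""
    base, remainder = divmod(total_records, num_partitions)
    ranges, start = [], 0
    for i in range(num_partitions):
        # Spread any remainder across the first few partitions.
        size = base + (1 if i < remainder else 0)
        ranges.append((start, start + size))
        start += size
    return ranges
```

For example, 10 records with `NUM_PARTITIONS` of 2 yields two sub-queries of 5 records each.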

# Domo connection options


The following are connection options for Domo:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in Domo.
+ `API_VERSION`(String) - (Required) Used for Read. Domo Rest API version you want to use.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for read. Field to be used to partition query.
+ `LOWER_BOUND`(String)- Used for read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for read. An exclusive upper bound value of the chosen partition field. 
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for read. Number of partitions for read.

# Domo limitations


The following are limitations or notes for Domo:
+ Due to an SDK limitation, filtering does not work as expected for queryable fields that start with `_` (for example, `_BATCH_ID`).
+ Due to an API limitation, filtering works on the date prior to the date you provide. This also affects incremental pulls. To work around this limitation, choose a date adjusted for your time zone's offset from UTC so that you get the data for the date you need.
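As an illustration of the workaround for the date-shift limitation, the helper below requests the day after the date you actually want. It is a hypothetical sketch; verify the exact shift against your own data and your time zone's offset from UTC:

```python
from datetime import date, timedelta

def adjusted_filter_date(target: date) -> date:
    # The API filters on the date prior to the one provided, so pass
    # the following day to receive data for the target date.
    return target + timedelta(days=1)
```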

# Connecting to Dynatrace

Dynatrace is a platform that offers analytics and automation for comprehensive observability and security. It specializes in monitoring and optimizing application performance, infrastructure, and user experience.

**Topics**
+ [

# AWS Glue support for Dynatrace
](dynatrace-support.md)
+ [

# Policies containing the API operations for creating and using connections
](dynatrace-configuring-iam-permissions.md)
+ [

# Configuring Dynatrace
](dynatrace-configuring.md)
+ [

# Configuring Dynatrace connections
](dynatrace-configuring-connections.md)
+ [

# Reading from Dynatrace entities
](dynatrace-reading-from-entities.md)
+ [

# Dynatrace connection options
](dynatrace-connection-options.md)
+ [

# Dynatrace limitations
](dynatrace-connection-limitations.md)

# AWS Glue support for Dynatrace


AWS Glue supports Dynatrace as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Dynatrace.

**Supported as a target?**  
No.

**Supported Dynatrace API versions**  
The following Dynatrace API versions are supported:
+ v2

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

If you prefer not to use the preceding method, you can use the following managed IAM policies instead:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Dynatrace


Before you can use AWS Glue to transfer data from Dynatrace, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a Dynatrace account.
+ You have generated an Access Token in your Dynatrace account with the appropriate read/write scope assigned to access the APIs. For more information, see [Generate a token](https://docs.dynatrace.com/docs/discover-dynatrace/references/dynatrace-api/basics/dynatrace-api-authentication#create-token).

If you meet these requirements, you’re ready to connect AWS Glue to your Dynatrace account.

# Configuring Dynatrace connections


Dynatrace supports custom authentication.

To configure a Dynatrace connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For the customer managed connected app, the secret should contain the connected app API token with `apiToken` as the key.

   1. Note: you must create a secret for each connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Data Source**, select Dynatrace.

   1. Provide the `INSTANCE_URL` of the Dynatrace account you want to connect to.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.
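The secret from step 1 is a single key/value pair. The following sketch builds the secret string with `apiToken` as the key; creating it in Secrets Manager (for example, with boto3 or the AWS CLI) is done separately and requires valid AWS credentials:

```python
import json

def build_dynatrace_secret(api_token: str) -> str:
    # The connection expects `apiToken` as the key name.
    return json.dumps({"apiToken": api_token})
```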

# Reading from Dynatrace entities


**Prerequisite**

A Dynatrace object you would like to read from. You will need the object name such as "problem".

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Problem | Yes | Yes | Yes | Yes | No | 

**Example**:

```
Dynatrace_read = glueContext.create_dynamic_frame.from_options(
    connection_type="Dynatrace",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "problem",
        "API_VERSION": "v2",
        "INSTANCE_URL": "https://[instanceName].live.dynatrace.com"
    }
)
```

**Dynatrace entity and field details**:

Dynatrace provides endpoints to fetch metadata dynamically for supported entities. Accordingly, operator support is captured at the datatype level.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/dynatrace-reading-from-entities.html)

## Partitioning queries


Dynatrace doesn’t support field-based or record-based partitioning.

# Dynatrace connection options


The following are connection options for Dynatrace:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in Dynatrace.
+ `API_VERSION`(String) - (Required) Used for Read. The Dynatrace REST API version you want to use.
+ `INSTANCE_URL`(String) - Used for Read. A valid Dynatrace instance URL.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. The columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. A filter condition in Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. A full Spark SQL query.
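The required and optional options above can be assembled programmatically. The following is a sketch using a hypothetical helper (`build_dynatrace_options` is not part of the AWS Glue API) that enforces the required options before a read:

```python
# Hypothetical helper (not part of the AWS Glue API): assembles the
# connection_options dict for a Dynatrace read and enforces the
# required options described above.
def build_dynatrace_options(connection_name, entity_name, api_version,
                            instance_url=None, selected_fields=None,
                            filter_predicate=None):
    if not entity_name:
        raise ValueError("ENTITY_NAME is required")
    if not api_version:
        raise ValueError("API_VERSION is required")
    options = {
        "connectionName": connection_name,
        "ENTITY_NAME": entity_name,
        "API_VERSION": api_version,
    }
    if instance_url:
        options["INSTANCE_URL"] = instance_url
    if selected_fields:
        # When omitted, SELECTED_FIELDS defaults to all columns (SELECT *).
        options["SELECTED_FIELDS"] = list(selected_fields)
    if filter_predicate:
        options["FILTER_PREDICATE"] = filter_predicate
    return options
```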

# Dynatrace limitations


The following are limitations and notes for Dynatrace:
+ Dynatrace doesn’t support field-based or record-based partitioning.
+ For the Select All feature, if you provide a "field" in the filter, no more than 10 records are returned per page.
+ The maximum supported page size is 500. If you select any of the `evidenceDetails`, `impactAnalysis`, or `recentComments` fields while creating the flow, the page size defaults to 10 records per page.
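The paging behavior described above can be summarized in a small sketch; `effective_page_size` is a hypothetical helper for illustration only, since the connector applies these rules internally:

```python
# Sketch of the Dynatrace paging rules described above (illustrative):
# maximum page size is 500, and selecting any of the listed heavyweight
# fields forces the page size down to 10.
HEAVY_FIELDS = {"evidenceDetails", "impactAnalysis", "recentComments"}

def effective_page_size(selected_fields, requested_size=500):
    if HEAVY_FIELDS & set(selected_fields):
        return 10
    return min(requested_size, 500)
```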

# Connecting to Facebook Ads

Facebook Ads is a powerful digital advertising platform used by businesses of all sizes to reach their target audience and achieve various marketing objectives. The platform allows advertisers to create tailored ads that can be displayed across Facebook's family of apps and services, including Facebook and Messenger. With its advanced targeting capabilities, Facebook Ads enables businesses to reach specific demographics, interests, behaviors, and locations.

**Topics**
+ [

# AWS Glue support for Facebook Ads
](facebook-ads-support.md)
+ [

# Policies containing the API operations for creating and using connections
](facebook-ads-configuring-iam-permissions.md)
+ [

# Configuring Facebook Ads
](facebook-ads-configuring.md)
+ [

# Configuring Facebook Ads connections
](facebook-ads-configuring-connections.md)
+ [

# Reading from Facebook Ads entities
](facebook-ads-reading-from-entities.md)
+ [

# Facebook Ads connection options
](facebook-ads-connection-options.md)
+ [

# Limitations and notes for Facebook Ads connector
](facebook-ads-connector-limitations.md)

# AWS Glue support for Facebook Ads


AWS Glue supports Facebook Ads as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Facebook Ads.

**Supported as a target?**  
No.

**Supported Facebook Ads API versions**  
The following Facebook Ads API versions are supported:
+ v17.0
+ v18.0
+ v19.0
+ v20.0

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]


```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following AWS managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.
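If you create a new role for these connections, the role's trust policy must also allow AWS Glue to assume it. A minimal trust policy looks like the following:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "glue.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```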

# Configuring Facebook Ads


Before you can use AWS Glue to transfer data from Facebook Ads, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ Facebook Standard accounts are accessed directly through Facebook.
+ User authentication is needed to generate the access token.
+ The Facebook Ads connector implements the *User Access Token OAuth* flow.
+ The connector uses OAuth 2.0 to authenticate its API requests to Facebook Ads. This web-based authentication falls under the Multi-Factor Authentication (MFA) architecture, which is a superset of 2FA.
+ The user must grant permissions to access the endpoints. For access to the user's data, endpoint authorization is handled through [permissions](https://developers.facebook.com/docs/permissions) and [features](https://developers.facebook.com/docs/features-reference).

## Getting OAuth 2.0 credentials


To obtain API credentials so that you can make authenticated calls to your instance, see [REST API](https://developers.facebook-ads.com/rest-api/) in the Facebook Ads Developer Guide.

# Configuring Facebook Ads connections


Facebook Ads supports the AUTHORIZATION_CODE grant type for OAuth2.
+ This grant type is considered three-legged OAuth as it relies on redirecting users to the third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue console.
+ Users may still opt to create their own connected app in Facebook Ads and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to Facebook Ads to login and authorize AWS Glue to access their resources.
+ This grant type results in an access token. An expiring system user token is valid for 60 days from the date it was generated or refreshed. To maintain continuity, the developer should refresh the access token within 60 days. Failing to do so forfeits the access token and requires the developer to obtain a new one to regain API access. See [Refresh Access Token](https://developers.facebook.com/docs/marketing-api/system-users/install-apps-and-generate-tokens/).
+ For public Facebook Ads documentation on creating an app for the Authorization Code OAuth flow, see [Install Apps and Generate Tokens](https://developers.facebook.com/docs/marketing-api/system-users/install-apps-and-generate-tokens/) in the Meta for Developers documentation.
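The 60-day expiry described above can be tracked with a sketch like the following; `needs_refresh` and the safety margin are assumptions for illustration, and AWS Glue manages token refresh for connections created through the console:

```python
from datetime import date, timedelta

# Sketch of the 60-day expiring-token rule described above: a token
# generated or refreshed on `issued` is valid for 60 days, so refresh
# it before that window closes. The safety margin is an illustrative
# assumption, not a Facebook requirement.
TOKEN_LIFETIME_DAYS = 60

def needs_refresh(issued, today, safety_margin_days=7):
    expires = issued + timedelta(days=TOKEN_LIFETIME_DAYS)
    return today >= expires - timedelta(days=safety_margin_days)
```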

To configure a Facebook Ads connection:

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select Facebook Ads.

   1. Provide the `INSTANCE_URL` of the Facebook Ads instance you want to connect to.

   1. Select the AWS IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]


      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that AWS Glue should use to store the tokens for this connection.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

# Reading from Facebook Ads entities


**Prerequisite**

A Facebook Ads object you would like to read from. You will need the object name. The following table shows the supported entities.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Campaign | Yes | Yes | No | Yes | Yes | 
| Ad Set | Yes | Yes | No | Yes | Yes | 
| Ads | Yes | Yes | No | Yes | Yes | 
| Ad Creative | No | Yes | No | Yes | No | 
| Insights - Account | No | Yes | No | Yes | No | 
| Adaccounts | Yes | Yes | No | Yes | No | 
| Insights - Ad | Yes | Yes | No | Yes | Yes | 
| Insights - AdSet | Yes | Yes | No | Yes | Yes | 
| Insights - Campaign | Yes | Yes | No | Yes | Yes | 

**Example**:

```
FacebookAds_read = glueContext.create_dynamic_frame.from_options(
    connection_type="FacebookAds",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v20.0"
    }
)
```

## Facebook Ads entity and field details


For more information about the entities and field details, see:
+ [Ad Account](https://developers.facebook.com/docs/marketing-api/reference/ad-account)
+ [Campaign](https://developers.facebook.com/docs/marketing-api/reference/ad-campaign-group)
+ [Ad Set](https://developers.facebook.com/docs/marketing-api/reference/ad-campaign)
+ [Ad](https://developers.facebook.com/docs/marketing-api/reference/adgroup)
+ [Ad Creative](https://developers.facebook.com/docs/marketing-api/reference/ad-creative)
+ [Insight Ad Account](https://developers.facebook.com/docs/marketing-api/reference/ad-account/insights)
+ [Insights Ads](https://developers.facebook.com/docs/marketing-api/reference/adgroup/insights/)
+ [Insights AdSets](https://developers.facebook.com/docs/marketing-api/reference/ad-campaign/insights)
+ [Insights Campaigns](https://developers.facebook.com/docs/marketing-api/reference/ad-campaign-group/insights)

For more information, see [Marketing API](https://developers.facebook.com/docs/marketing-api/reference/v21.0).

**Note**  
Struct and List data types are converted to the String data type in the connector response.

## Partitioning queries


You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query would be split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For a DateTime field, the connector accepts the Spark timestamp format used in Spark SQL queries.

  Example of a valid value:

  ```
  "2022-01-01"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.

Example:

```
FacebookADs_read = glueContext.create_dynamic_frame.from_options(
    connection_type="FacebookAds",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v20.0",
        "PARTITION_FIELD": "created_time"
        "LOWER_BOUND": "2022-01-01"
        "UPPER_BOUND": "2024-01-02"
        "NUM_PARTITIONS": "10"
    }
```
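Conceptually, partitioning splits the half-open range [`LOWER_BOUND`, `UPPER_BOUND`) into `NUM_PARTITIONS` equal sub-ranges, one per concurrent sub-query. The following sketch illustrates the split with a hypothetical `partition_ranges` helper:

```python
from datetime import datetime

# Hypothetical illustration of how [LOWER_BOUND, UPPER_BOUND) could be
# split into NUM_PARTITIONS contiguous sub-ranges, one per concurrent
# Spark sub-query. Each sub-range is inclusive at its start and
# exclusive at its end, matching the bound semantics above.
def partition_ranges(lower, upper, num_partitions):
    step = (upper - lower) / num_partitions
    return [(lower + i * step, lower + (i + 1) * step)
            for i in range(num_partitions)]
```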

# Facebook Ads connection options


The following are connection options for Facebook Ads:
+ `ENTITY_NAME`(String) - (Required) Used for read. The name of your object in Facebook Ads.
+ `API_VERSION`(String) - (Required) Used for read. The Facebook Ads REST API version you want to use. For example: v20.0.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for read. The columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for read. A filter condition in Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for read. A full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for read. The field used to partition the query.
+ `LOWER_BOUND`(String) - Used for read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for read. An exclusive upper bound value of the chosen partition field. 
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for read. The number of partitions for the read.
+ `TRANSFER_MODE`(String) - Default: `SYNC`. Indicates whether to read synchronously (`SYNC`) or asynchronously (`ASYNC`).

# Limitations and notes for Facebook Ads connector


The following are limitations or notes for the Facebook Ads connector:
+ Because Facebook Ads supports dynamic metadata, all fields can be queried. All fields support filtering; records are fetched when data is available, otherwise Facebook returns a Bad Request (400) response with a descriptive error message.
+ An app's call count is limited to 200 calls per user during a rolling one-hour window, multiplied by the number of users. For rate limit details, see [Rate Limits](https://developers.facebook.com/docs/graph-api/overview/rate-limiting/) and [Business Use Case Rate Limits](https://developers.facebook.com/docs/graph-api/overview/rate-limiting/#buc-rate-limits).
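The call budget above works out to simple arithmetic; the following sketch is for illustration only:

```python
# Arithmetic for the app-level rate limit described above: 200 calls
# per user during a rolling one-hour window, multiplied by the number
# of users of the app.
CALLS_PER_USER_PER_HOUR = 200

def hourly_call_budget(num_users):
    return CALLS_PER_USER_PER_HOUR * num_users
```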

# Connecting to Facebook Page Insights

Facebook Pages allow companies and other interest groups to create pages for the Facebook.com social network. Companies use these pages to share open hours, make announcements, and engage with customers online. If you are a Facebook Page Insights user, you can connect AWS Glue to your Facebook Page Insights account. You can use Facebook Page Insights as a data source in your ETL jobs. Run these jobs to transfer data from Facebook Page Insights to AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for Facebook Page Insights
](facebook-page-insights-support.md)
+ [

# Policies containing the API operations for creating and using connections
](facebook-page-insights-configuring-iam-permissions.md)
+ [

# Configuring Facebook Page Insights
](facebook-page-insights-configuring.md)
+ [

# Configuring Facebook Page Insights connections
](facebook-page-insights-configuring-connections.md)
+ [

# Reading from Facebook Page Insights entities
](facebook-page-insights-reading-from-entities.md)
+ [

# Facebook Page Insights connection options
](facebook-page-insights-connection-options.md)
+ [

# Limitations and notes for Facebook Page Insights connector
](facebook-page-insights-connector-limitations.md)

# AWS Glue support for Facebook Page Insights


AWS Glue supports Facebook Page Insights as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Facebook Page Insights.

**Supported as a target?**  
No.

**Supported Facebook Page Insights API versions**  
The following Facebook Page Insights API versions are supported:
+ v17
+ v18
+ v19
+ v20
+ v21

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]


```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following AWS managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Facebook Page Insights


Before you can use AWS Glue to transfer data from Facebook Page Insights, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ Facebook Standard accounts are accessed directly through Facebook.
+ User authentication is needed to generate the access token.
+ The Facebook Page Insights connector implements the User Access Token OAuth flow.
+ The connector uses OAuth 2.0 to authenticate its API requests to Facebook Page Insights. This web-based authentication falls under the Multi-Factor Authentication (MFA) architecture, which is a superset of 2FA.
+ The user must grant permissions to access the endpoints. For access to the user's data, endpoint authorization is handled through permissions and features.

# Configuring Facebook Page Insights connections


To configure a Facebook Page Insights connection:

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select Facebook Page Insights.

   1. Select the AWS IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]


      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the authorization code URL.

   1. Select the `secretName` that AWS Glue should use to store the tokens for this connection.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

# Reading from Facebook Page Insights entities


**Prerequisite**

A Facebook Page Insights object you would like to read from. You will need the object name.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Page Content | Yes | No | Yes | Yes | Yes | 
| Page CTA Clicks | Yes | No | No | Yes | Yes | 
| Page Engagement | Yes | No | No | Yes | Yes | 
| Page Impressions | Yes | No | No | Yes | Yes | 
| Page Posts | Yes | No | No | Yes | Yes | 
| Page Post Engagement | No | No | No | Yes | No | 
| Page Post Reactions | No | No | No | Yes | No | 
| Page Reactions | Yes | No | No | Yes | Yes | 
| Stories | Yes | No | No | Yes | Yes | 
| Page User Demographics | Yes | No | No | Yes | Yes | 
| Page Video Views | Yes | No | No | Yes | Yes | 
| Page Views | Yes | No | No | Yes | Yes | 
| Page Video Posts | Yes | No | No | Yes | Yes | 
| Pages | No | Yes | No | Yes | No | 
| Feeds | Yes | Yes | No | Yes | Yes | 

**Example**:

```
facebookPageInsights_read = glueContext.create_dynamic_frame.from_options(
    connection_type="facebookpageinsights",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v21"
    }
)
```

**Facebook Page Insights field details**:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/facebook-page-insights-reading-from-entities.html)

## Partitioning queries


**Filter-based partitioning**:

You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query would be split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For a Datetime field, the connector accepts the Spark timestamp format used in Spark SQL queries.

  Example of a valid value:

  ```
  "2024-09-30T01:01:01.000Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.

Example:

```
facebookPageInsights_read = glueContext.create_dynamic_frame.from_options(
     connection_type="facebookpageinsights",
     connection_options={
         "connectionName": "connectionName",
         "ENTITY_NAME": "entityName",
         "API_VERSION": "v21",
         "PARTITION_FIELD": "created_Time"
         "LOWER_BOUND": "2024-10-27T07:00:00+0000"
         "UPPER_BOUND": "2024-10-27T07:00:00+0000"
         "NUM_PARTITIONS": "10"
     }
```

# Facebook Page Insights connection options


The following are connection options for Facebook Page Insights:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in Facebook Page Insights.
+ `API_VERSION`(String) - (Required) Used for Read. The Facebook Page Insights REST API version you want to use.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. The columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. A filter condition in Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition query.
+ `LOWER_BOUND`(String)- Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read.
+ `INSTANCE_URL`(String) - (Required) Used for Read. A valid Facebook Page Insights instance URL.

# Limitations and notes for Facebook Page Insights connector


The following are limitations or notes for the Facebook Page Insights connector:
+ Most metrics will update once every 24 hours.
+ Only the last two years of insights data is available.
+ Only 90 days of insights can be viewed at one time when using the `since` and `until` parameters.
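Because of the 90-day window, a longer backfill must be split into multiple requests. The following sketch uses a hypothetical `insight_windows` helper to chunk a date range into windows of at most 90 days:

```python
from datetime import date, timedelta

# Hypothetical backfill helper: only 90 days of insights can be
# requested at a time with `since`/`until`, so split a longer interval
# [since, until) into consecutive windows of at most 90 days.
MAX_WINDOW_DAYS = 90

def insight_windows(since, until):
    windows = []
    start = since
    while start < until:
        end = min(start + timedelta(days=MAX_WINDOW_DAYS), until)
        windows.append((start, end))
        start = end
    return windows
```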

# Connecting to Freshdesk

Freshdesk is a cloud-based customer support software that is both feature-rich and easy to use. With multiple support channels available, including live chat, email, phone, and social media, you can help customers through their preferred communication method. If you are a Freshdesk user, you can connect AWS Glue to your Freshdesk account. You can use Freshdesk as a data source in your ETL jobs. Run these jobs to transfer data from Freshdesk to AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for Freshdesk
](freshdesk-support.md)
+ [

# Policies containing the API operations for creating and using connections
](freshdesk-configuring-iam-permissions.md)
+ [

# Configuring Freshdesk
](freshdesk-configuring.md)
+ [

# Configuring Freshdesk connections
](freshdesk-configuring-connections.md)
+ [

# Reading from Freshdesk entities
](freshdesk-reading-from-entities.md)
+ [

# Freshdesk connection options
](freshdesk-connection-options.md)
+ [

# Limitations and notes for Freshdesk connector
](freshdesk-connector-limitations.md)

# AWS Glue support for Freshdesk


AWS Glue supports Freshdesk as follows:

**Supported as a source?**  
Yes – Sync and Async. You can use AWS Glue ETL jobs to query data from Freshdesk.

**Supported as a target?**  
No.

**Supported Freshdesk API versions**  
The following Freshdesk API versions are supported:
+ v2

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]


```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following AWS managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Freshdesk


Before you can use AWS Glue to transfer data from Freshdesk, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ A Freshdesk account. You can choose from Free, Growth, Pro or Enterprise editions.
+ A Freshdesk user's API key.

# Configuring Freshdesk connections


Freshdesk supports custom authentication.

For public Freshdesk documentation on generating the required API keys for custom authorization, see [Freshdesk authentication](https://developers.freshdesk.com/api/#authentication).

The following are the steps to configure a Freshdesk connection:
+ In AWS Secrets Manager, create a secret with the following details:
  + For a customer-managed connected app, the secret should contain the connected app API key under the key `apiKey`. Note that you must create one secret per connection in AWS Glue.
+ In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:
  + When selecting a **Data source**, select Freshdesk.
  + Provide the `INSTANCE_URL` of the Freshdesk instance you want to connect to.
  + Select the AWS IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]


    ```
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "secretsmanager:DescribeSecret",
            "secretsmanager:GetSecretValue",
            "secretsmanager:PutSecretValue",
            "ec2:CreateNetworkInterface",
            "ec2:DescribeNetworkInterfaces",
            "ec2:DeleteNetworkInterface"
          ],
          "Resource": "*"
        }
      ]
    }
    ```

------
  + Select the `secretName` that AWS Glue should use to store the tokens for this connection.
  + Select the network options if you want to use your network.
+ Grant the IAM role associated with your AWS Glue job permission to read `secretName`.
+ In your AWS Glue job configuration, provide `connectionName` as an **Additional network connection**.
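The secret created in the first step is a simple key/value document. A sketch of its JSON body follows; the key name `apiKey` is required, and the value shown is a placeholder:

```
{
  "apiKey": "your-freshdesk-api-key"
}
```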

# Reading from Freshdesk entities


**Prerequisite**

A Freshdesk object you would like to read from. You will need the object name.

**Supported entities for Sync source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Agents | Yes | Yes | No | Yes | Yes | 
| Business Hours | No | Yes | No | Yes | Yes | 
| Company | Yes | Yes | No | Yes | Yes | 
| Contacts | Yes | Yes | No | Yes | Yes | 
| Conversations | No | Yes | No | Yes | No | 
| Email Configs | No | Yes | No | Yes | No | 
| Email Inboxes | Yes | Yes | Yes | Yes | No | 
| Forum Categories | No | Yes | No | Yes | No | 
| Forums | No | Yes | No | Yes | No | 
| Groups | No | Yes | No | Yes | No | 
| Products | No | Yes | No | Yes | No | 
| Roles | No | Yes | No | Yes | No | 
| Satisfaction Ratings | Yes | Yes | No | Yes | No | 
| Skills | No | Yes | No | Yes | No | 
| Solutions | Yes | Yes | No | Yes | No | 
| Surveys | No | Yes | No | Yes | No | 
| Tickets | Yes | Yes | Yes | Yes | Yes | 
| Time Entries | Yes | Yes | No | Yes | No | 
| Topics | No | Yes | No | Yes | No | 
| Topic Comments | No | Yes | No | Yes | No | 

**Supported entities for Async source**:


| Entity | API version | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | --- | 
| Companies | v2 | No | No | No | No | No | 
| Contacts | v2 | No | No | No | No | No | 

**Example**:

```
freshdesk_read = glueContext.create_dynamic_frame.from_options(
    connection_type="freshdesk",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v2"
    }
)
```

**Freshdesk entity and field details**:


| Entity | Field | 
| --- | --- | 
| Agents | https://developers.freshdesk.com/api/#list_all_agents | 
| Business-hours | https://developers.freshdesk.com/api/#list_all_business_hours | 
| Comments | https://developers.freshdesk.com/api/#comment_attributes | 
| Company | https://developers.freshdesk.com/api/#companies | 
| Contacts | https://developers.freshdesk.com/api/#list_all_contacts | 
| Conversations | https://developers.freshdesk.com/api/#list_all_ticket_notes | 
| Email-configs | https://developers.freshdesk.com/api/#list_all_email_configs | 
| Email-inboxes | https://developers.freshdesk.com/api/#list_all_email_mailboxes | 
| Forum-categories | https://developers.freshdesk.com/api/#category_attributes | 
| Forums | https://developers.freshdesk.com/api/#forum_attributes | 
| Groups | https://developers.freshdesk.com/api/#list_all_groups | 
| Products | https://developers.freshdesk.com/api/#list_all_products | 
| Roles | https://developers.freshdesk.com/api/#list_all_roles | 
| Satisfaction-rating | https://developers.freshdesk.com/api/#view_all_satisfaction_ratings | 
| Skills | https://developers.freshdesk.com/api/#list_all_skills | 
| Solutions | https://developers.freshdesk.com/api/#solution_content | 
| Surveys | https://developers.freshdesk.com/api/#list_all_survey | 
| Tickets | https://developers.freshdesk.com/api/#list_all_tickets | 
| Time-entries | https://developers.freshdesk.com/api/#list_all_time_entries | 
| Topics | https://developers.freshdesk.com/api/#topic_attributes | 

## Partitioning queries


**Filter-based partitioning**:

You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query would be split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For the Datetime field, we accept the Spark timestamp format used in Spark SQL queries.

  Example of a valid value:

  ```
  "2024-09-30T01:01:01.000Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.

Example:

```
freshDesk_read = glueContext.create_dynamic_frame.from_options(
     connection_type="freshdesk",
     connection_options={
         "connectionName": "connectionName",
         "ENTITY_NAME": "entityName",
         "API_VERSION": "v2",
         "PARTITION_FIELD": "Created_Time",
         "LOWER_BOUND": "2024-10-20T23:16:08Z",
         "UPPER_BOUND": "2024-10-27T23:16:08Z",
         "NUM_PARTITIONS": "10"
     }
)
```
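Conceptually, the read splits the range from `LOWER_BOUND` (inclusive) to `UPPER_BOUND` (exclusive) into `NUM_PARTITIONS` equal sub-ranges, one per concurrent sub-query. A minimal sketch of that split (illustrative only; the connector does this internally):

```python
from datetime import datetime

def split_partitions(lower: str, upper: str, num_partitions: int):
    """Split [lower, upper) into num_partitions equal datetime sub-ranges."""
    lo = datetime.fromisoformat(lower.replace("Z", "+00:00"))
    hi = datetime.fromisoformat(upper.replace("Z", "+00:00"))
    step = (hi - lo) / num_partitions
    bounds = [lo + step * i for i in range(num_partitions + 1)]
    # Each tuple is an (inclusive, exclusive) sub-range for one sub-query.
    return [(bounds[i], bounds[i + 1]) for i in range(num_partitions)]

ranges = split_partitions("2024-10-20T00:00:00Z", "2024-10-30T00:00:00Z", 5)
```

Each resulting sub-range becomes a filter on the partition field, so the sub-queries cover the full range without overlapping.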

# Freshdesk connection options


The following are connection options for Freshdesk:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in Freshdesk.
+ `API_VERSION`(String) - (Required) Used for Read. The Freshdesk REST API version you want to use.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT \*). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for Read. Field used to partition the query.
+ `LOWER_BOUND`(String) - Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for the read.
+ `INSTANCE_URL`(String) - (Required) Used for Read. A valid Freshdesk instance URL.
+ `TRANSFER_MODE`(String) - (Optional) Used for Read. Indicates the type of processing, `SYNC` or `ASYNC`. Set to `SYNC` by default.
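As a quick illustration of how these options combine, the following sketch assembles a hypothetical option set for reading the `Tickets` entity with a column projection and a row filter; the connection name, instance URL, and field names are placeholder values, not taken from a real account:

```python
# Hypothetical Freshdesk read options; replace the placeholder values
# with those from your own connection and account.
connection_options = {
    "connectionName": "my-freshdesk-connection",
    "ENTITY_NAME": "Tickets",
    "API_VERSION": "v2",
    "INSTANCE_URL": "https://example.freshdesk.com",
    "SELECTED_FIELDS": ["id", "subject", "status"],  # default: all columns
    "FILTER_PREDICATE": "status = 2",                # Spark SQL syntax
    "NUM_PARTITIONS": "1",                           # default: 1
}

# In a Glue job this dict would be passed to:
# glueContext.create_dynamic_frame.from_options(
#     connection_type="freshdesk", connection_options=connection_options)
```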

# Limitations and notes for Freshdesk connector


The following are limitations or notes for the Freshdesk connector:
+ The `Company`, `Contacts`, and `Tickets` entities with filtration have pagination limitations. They return only 30 records per page, and the page value can be set to a maximum of 10 (fetching a maximum of 300 records).
+ The `Tickets` entity does not fetch records older than 30 days.
+ The `Company`, `Contacts`, and `Tickets` entities support the 'Date' datatype in filtration. Select the 'Daily' or less frequent trigger frequencies for these three entities; selecting 'Minutes' or 'Hourly' can lead to duplicate data. Also, when selecting these fields for filtration, select only the date value, since only the date portion of the selected timestamp is considered.
+ The number of API calls per minute is based on your plan. This limit is applied account-wide, irrespective of factors such as the number of agents or IP addresses used to make the calls. For all trial users, there is a default API limit of 50 calls/minute. For more details, see [Freshdesk](https://developer.freshdesk.com/api/#ratelimit).
+ For any entity, only one Export/Async job is processed at a time. A new job is processed only after the existing job has completed successfully or failed. For more details, see [Freshdesk](https://developers.freshdesk.com/api/#export_contact).
+ The following fields are supported for Sync API calls, but are not allowed in the Async API request body:
  + id
  + created_at
  + updated_at
  + updated_since
  + active
  + company_id
  + other_companies
  + avatar
  + view_all_tickets
  + deleted
  + other_emails
  + state
  + tag
  + tags
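The 300-record cap above is simply 30 records per page times a maximum page value of 10. A small simulation of that filtered-pagination limit (illustrative only; the connector handles paging for you):

```python
PAGE_SIZE = 30   # records per page for filtered Company/Contacts/Tickets calls
MAX_PAGE = 10    # highest page value Freshdesk accepts for these calls

def fetch_filtered(all_records):
    """Simulate filtered pagination: at most MAX_PAGE pages of PAGE_SIZE records."""
    fetched = []
    for page in range(1, MAX_PAGE + 1):
        start = (page - 1) * PAGE_SIZE
        batch = all_records[start:start + PAGE_SIZE]
        if not batch:
            break
        fetched.extend(batch)
    return fetched

# However many records match the filter, at most 300 come back.
capped = fetch_filtered(list(range(500)))
```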

# Connecting to Freshsales

Freshsales is an intuitive CRM that helps sales reps take the guesswork out of sales. With built-in phone and email, tasks, appointments, and notes, sales reps don't have to toggle between tabs to follow up on prospects. You can manage your deals better with the pipeline view and drive more deals to closure. If you're a Freshsales user, you can connect AWS Glue to your Freshsales account and use Freshsales as a data source in your ETL jobs. Run these jobs to transfer data from Freshsales to AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for Freshsales
](freshsales-support.md)
+ [

# Policies containing the API operations for creating and using connections
](freshsales-configuring-iam-permissions.md)
+ [

# Configuring Freshsales
](freshsales-configuring.md)
+ [

# Configuring Freshsales connections
](freshsales-configuring-connections.md)
+ [

# Reading from Freshsales entities
](freshsales-reading-from-entities.md)
+ [

# Freshsales connection options
](freshsales-connection-options.md)
+ [

# Freshsales limitations
](freshsales-connection-limitations.md)

# AWS Glue support for Freshsales


AWS Glue supports Freshsales as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Freshsales.

**Supported as a target?**  
No.

**Supported Freshsales API versions**  
The following Freshsales API versions are supported:
+ v1.0

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------
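If you script role setup, the inline policy above can be attached with boto3; a minimal sketch, assuming hypothetical role and policy names:

```python
import json

# The inline policy from above, as a Python structure.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "glue:ListConnectionTypes",
                "glue:DescribeConnectionType",
                "glue:RefreshOAuth2Tokens",
                "glue:ListEntities",
                "glue:DescribeEntity",
            ],
            "Resource": "*",
        }
    ],
}

policy_document = json.dumps(policy)

# With boto3 (not run here), the policy could be attached to a role:
# import boto3
# boto3.client("iam").put_role_policy(
#     RoleName="GlueFreshsalesRole",           # hypothetical role name
#     PolicyName="GlueConnectionPermissions",  # hypothetical policy name
#     PolicyDocument=policy_document,
# )
```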

Alternatively, you can use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Freshsales


Before you can use AWS Glue to transfer data from Freshsales, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a Freshsales account.
+ You have a user API key.

If you meet these requirements, you’re ready to connect AWS Glue to your Freshsales account. For typical connections, you don't need to do anything else in Freshsales.

# Configuring Freshsales connections


Freshsales supports custom authentication.

For public Freshsales documentation on generating the required API keys for custom authentication, see [Authentication](https://developer.freshsales.io/api/#authentication).

To configure a Freshsales connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For the customer managed connected app, the secret should contain the connected app API key with `apiSecretKey` as the key. The secret also needs to contain another key-value pair with `apiKey` as the key and `token` as the value.

   1. Note: You must create a secret for your connections in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Data Source**, select Freshsales.

   1. Provide the `INSTANCE_URL` of the Freshsales account you want to connect to.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.
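For that last step, a minimal identity-policy sketch granting read access to the secret (the Region, account ID, and secret ARN below are placeholders to replace with your own):

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "secretsmanager:GetSecretValue",
      "Resource": "arn:aws:secretsmanager:us-east-1:111122223333:secret:secretName-*"
    }
  ]
}
```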

# Reading from Freshsales entities


**Prerequisite**

A Freshsales object you would like to read from. You will need the object name.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select \* | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Accounts | Yes | Yes | Yes | Yes | Yes | 
| Contacts | Yes | Yes | Yes | Yes | Yes | 

**Example**:

```
freshSales_read = glueContext.create_dynamic_frame.from_options(
     connection_type="freshsales",
     connection_options={
         "connectionName": "connectionName",
         "ENTITY_NAME": "entityName",
         "API_VERSION": "v1.0"
     }
)
```

**Freshsales entity and field details**:

Freshsales provides endpoints to fetch metadata dynamically for supported entities. Accordingly, operator support is captured at the datatype level.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/freshsales-reading-from-entities.html)

## Partitioning queries


**Filter-based partitioning**:

You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For the Datetime field, we accept the value in ISO format.

  Example of a valid value:

  ```
  "2024-09-30T01:01:01.000Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.

Example:

```
freshSales_read = glueContext.create_dynamic_frame.from_options(
     connection_type="freshsales",
     connection_options={
         "connectionName": "connectionName",
         "ENTITY_NAME": "entityName",
         "API_VERSION": "v1.0",
         "PARTITION_FIELD": "Created_Time",
         "LOWER_BOUND": "2024-10-15T21:16:25Z",
         "UPPER_BOUND": "2024-10-20T21:25:50Z",
         "NUM_PARTITIONS": "10"
     }
)
```

# Freshsales connection options


The following are connection options for Freshsales:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in Freshsales.
+ `API_VERSION`(String) - (Required) Used for Read. The Freshsales REST API version you want to use.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT \*). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for Read. Field used to partition the query.
+ `LOWER_BOUND`(String) - Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for the read.
+ `INSTANCE_URL`(String) - Used for Read. A valid Freshsales instance URL.

# Freshsales limitations


The following are limitations or notes for Freshsales:
+ In Freshsales, the API rate limit is 1,000 API requests per hour per account (see [Errors](https://developer.freshsales.io/api/#error)). However, this limit can be extended with the Enterprise subscription plan (see the [plan comparison](https://www.freshworks.com/crm/pricing-compare/)).

# Connecting to Google Ads


 The Google Ads API is the programmatic interface to Google Ads, used for managing large or complex Google Ads accounts and campaigns. If you're a Google Ads user, you can connect AWS Glue to your Google Ads account. Then, you can use Google Ads as a data source in your ETL jobs. Run these jobs to transfer data between Google Ads and AWS services or other supported applications. 

**Topics**
+ [

# AWS Glue support for Google Ads
](googleads-support.md)
+ [

# Policies containing the API operations for creating and using connections
](googleads-configuring-iam-permissions.md)
+ [

# Configuring Google Ads
](googleads-configuring.md)
+ [

# Configuring Google Ads connections
](googleads-configuring-connections.md)
+ [

# Reading from Google Ads entities
](googleads-reading-from-entities.md)
+ [

# Google Ads connection options
](googleads-connection-options.md)
+ [

# Creating a Google Ads account
](googleads-create-account.md)
+ [

# Limitations
](googleads-connector-limitations.md)

# AWS Glue support for Google Ads


AWS Glue supports Google Ads as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Google Ads.

**Supported as a target?**  
No.

**Supported Google Ads API versions**  
v18

# Policies containing the API operations for creating and using connections

 The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

You can also use the following managed IAM policies to allow access:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Google Ads


Before you can use AWS Glue to transfer data from Google Ads, you must meet these requirements:

## Minimum requirements

+  You have a Google Ads account with Email and Password. For more information on creating an account, see [Creating a Google Ads account](googleads-create-account.md). 
+  Your Google Ads account is enabled for API access. All use of the Google Ads API is available at no additional cost. 
+  Your Google Ads account allows you to install connected apps. If you lack access to this functionality, contact your Google Ads administrator. 

 If you meet these requirements, you’re ready to connect AWS Glue to your Google Ads account. 

# Configuring Google Ads connections


Google Ads supports the `AUTHORIZATION_CODE` grant type for OAuth2.

This grant type is considered “three-legged” OAuth because it relies on redirecting users to the third-party authorization server to authenticate. It is used when creating connections through the AWS Glue console. The AWS Glue console redirects the user to Google Ads, where the user must log in and grant AWS Glue the requested permissions to access their Google Ads instance.

Users may opt to create their own connected app in Google Ads and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to Google Ads to log in and authorize AWS Glue to access their resources.

This grant type results in a refresh token and an access token. The access token is short-lived and can be refreshed automatically, without user interaction, using the refresh token.

For more information, see the [public Google Ads documentation on creating a connected app for the Authorization Code OAuth flow](https://developers.google.com/workspace/guides/create-credentials).
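The automatic token refresh mentioned above follows the standard OAuth2 refresh-token exchange: the client POSTs the refresh token and client credentials to Google's token endpoint and receives a new short-lived access token. A minimal sketch of that request body (AWS Glue performs this exchange for you; the credential values are placeholders):

```python
import urllib.parse

# Google's OAuth2 token endpoint for the refresh-token exchange.
GOOGLE_TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"

def refresh_request_body(client_id: str, client_secret: str, refresh_token: str) -> bytes:
    """Build the form-encoded body for an OAuth2 refresh-token exchange."""
    return urllib.parse.urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
        "grant_type": "refresh_token",
    }).encode()

# Placeholder credentials; POSTing this body to GOOGLE_TOKEN_ENDPOINT
# returns JSON containing a new short-lived "access_token".
body = refresh_request_body("example-client-id", "example-client-secret", "example-refresh-token")
```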

To configure a Google Ads connection:

1.  In AWS Secrets Manager, create a secret with the following details. It is required to create a secret for each connection in AWS Glue. 

   1.  For AuthorizationCode grant type: 
      +  For customer managed connected app – Secret should contain the connected app Consumer Secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as key. 

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select Google Ads.

   1. Provide the `developer token` of the Google Ads account you want to connect to.

   1. Provide the `MANAGER ID` of the Google Ads account if you want to log in as a manager.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection.

   1.  Select the network options if you want to use your network. 

1.  Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 

# Reading from Google Ads entities


 **Prerequisites** 
+ A Google Ads object you would like to read from. Refer to the supported entities table below to check the available entities.

 **Supported entities** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select \* | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Ad Group Ad | Yes | Yes | Yes | No | Yes | 
| Ad Group | Yes | Yes | Yes | No | Yes | 
| Campaign Budget | Yes | Yes | Yes | Yes | Yes | 
| Account Budget | Yes | No | Yes | Yes | No | 
| Campaign | Yes | Yes | Yes | Yes | Yes | 
| Account | Yes | No | Yes | No | No | 

 **Example** 

```
googleAds_read = glueContext.create_dynamic_frame.from_options(
    connection_type="googleads",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "campaign-3467***",
        "API_VERSION": "v16"
    }
```

 **Google Ads entity and field details** 


| Entity | Field | Data Type | Supported Operators | 
| --- | --- | --- | --- | 
| Account | resourceName | String | !=, = | 
| Account | callReportingEnabled | Boolean | !=, = | 
| Account | callConversionReportingEnabled | Boolean | !=, = | 
| Account | callConversionAction | String | !=, = | 
| Account | conversionTrackingId | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Account | crossAccountConversionTrackingId | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Account | payPerConversionEligibilityFailureReasons | List |  | 
| Account | id | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Account | currencyCode | String | !=, =, LIKE | 
| Account | timeZone | String | !=, =, LIKE | 
| Account | autoTaggingEnabled | Boolean | !=, = | 
| Account | hasPartnersBadge | Boolean | !=, = | 
| Account | manager | Boolean | !=, = | 
| Account | testAccount | Boolean | !=, = | 
| Account | date | Date | BETWEEN, =, <, >, <=, >= | 
| Account | costMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Account | acceptedCustomerDataTerms | Boolean |  | 
| Account | conversionTrackingStatus | String | !=, =, LIKE | 
| Account | enhancedConversionsForLeadsEnabled | Boolean |  | 
| Account | googleAdsConversionCustomer | String |  | 
| Account | status | String | !=, = | 
| Account | allConversionsByConversionDate | Double | !=, =, <, > | 
| Account | allConversionsValueByConversionDate | Double | !=, =, <, > | 
| Account | conversionsByConversionDate | Double | !=, =, <, > | 
| Account | conversionsValueByConversionDate | Double | !=, =, <, > | 
| Account | valuePerAllConversionsByConversionDate | Double | !=, =, <, > | 
| Account | videoViews | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Account | clicks | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Account | invalidClicks | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Account | costPerAllConversions | Double | !=, =, <, > | 
| Account | costPerConversion | Double | !=, =, <, > | 
| Account | conversions | Double | !=, =, <, > | 
| Account | absoluteTopImpressionPercentage | Double | !=, =, <, > | 
| Account | impressions | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Account | topImpressionPercentage | Double | !=, =, <, > | 
| Account | averageCpc | Double | !=, =, <, > | 
| Account | activeViewMeasurableCostMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Account | averageCost | Double | !=, =, <, > | 
| Account | ctr | Double | !=, =, <, > | 
| Account | activeViewCtr | Double | !=, =, <, > | 
| Account | searchImpressionShare | Double | !=, =, <, > | 
| Account | conversionAction | String | !=, = | 
| Account | conversionActionCategory | String | !=, = | 
| Account | conversionActionName | String | !=, =, LIKE | 
| Account Budget | resourceName | String | !=, = | 
| Account Budget | status | String | !=, = | 
| Account Budget | proposedEndTimeType | String | !=, = | 
| Account Budget | approvedEndTimeType | String | !=, = | 
| Account Budget | id | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Account Budget | billingSetup | String | !=, = | 
| Account Budget | name | String | !=, =, LIKE | 
| Account Budget | approvedStartDateTime | DateTime | BETWEEN, =, <, >, <=, >= | 
| Account Budget | proposedSpendingLimitMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Account Budget | approvedSpendingLimitMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Account Budget | adjustedSpendingLimitMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Account Budget | amountServedMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group | resourceName | String | !=, =, LIKE | 
| Ad Group | status | String | !=, =, LIKE | 
| Ad Group | type | String | !=, =, LIKE | 
| Ad Group | id | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group | name | String | !=, =, LIKE | 
| Ad Group | campaign | String | !=, = | 
| Ad Group | cpcBidMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group | targetCpaMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group | cpmBidMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group | cpvBidMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group | targetCpmMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group | effectiveTargetCpaMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group | date | Date | BETWEEN, =, <, >, <=, >= | 
| Ad Group | costMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group | useAudienceGrouped | Boolean | !=, = | 
| Ad Group | effectiveCpcBidMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group | allConversionsByConversionDate | Double | !=, =, <, > | 
| Ad Group | allConversionsValueByConversionDate | Double | !=, =, <, > | 
| Ad Group | conversionsByConversionDate | Double | !=, =, <, > | 
| Ad Group | conversionsValueByConversionDate | Double | !=, =, <, > | 
| Ad Group | valuePerAllConversionsByConversionDate | Double | !=, =, <, > | 
| Ad Group | valuePerConversionsByConversionDate | Double | !=, =, <, > | 
| Ad Group | averageCost | Double | !=, =, <, > | 
| Ad Group | costPerAllConversions | Double | !=, =, <, > | 
| Ad Group | costPerConversion | Double | !=, =, <, > | 
| Ad Group | averagePageViews | Double | !=, =, <, > | 
| Ad Group | videoViews | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group | clicks | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group | allConversions | Double | !=, =, <, > | 
| Ad Group | averageCpc | Double | !=, =, <, > | 
| Ad Group | absoluteTopImpressionPercentage | Double | !=, =, <, > | 
| Ad Group | impressions | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group | topImpressionPercentage | Double | !=, =, <, > | 
| Ad Group | activeViewCtr | Double | !=, =, <, > | 
| Ad Group | ctr | Double | !=, =, <, > | 
| Ad Group | searchTopImpressionShare | Double | !=, =, <, > | 
| Ad Group | searchImpressionShare | Double | !=, =, <, > | 
| Ad Group | searchAbsoluteTopImpressionShare | Double | !=, =, <, > | 
| Ad Group | relativeCtr | Double | !=, =, <, > | 
| Ad Group | conversionAction | String | !=, = | 
| Ad Group | conversionActionCategory | String | !=, = | 
| Ad Group | conversionActionName | String | !=, =, LIKE | 
| Ad Group | updateMask | String |  | 
| Ad Group | create | Struct |  | 
| Ad Group | update | Struct |  | 
| Ad Group | primaryStatus | String | !=, = | 
| Ad Group | primaryStatusReasons | List |  | 
| Ad Group Ad | resourceName | String | !=, = | 
| Ad Group Ad | id | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group Ad | status | String | !=, = | 
| Ad Group Ad | labels | List |  | 
| Ad Group Ad | adGroup | String | !=, = | 
| Ad Group Ad | costMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group Ad | approvalStatus | String | !=, = | 
| Ad Group Ad | reviewStatus | String | !=, = | 
| Ad Group Ad | adStrength | String | !=, = | 
| Ad Group Ad | type | String | !=, = | 
| Ad Group Ad | businessName | String | !=, =, LIKE | 
| Ad Group Ad | date | Date | BETWEEN, =, <, >, <=, >= | 
| Ad Group Ad | allConversionsByConversionDate | Double | !=, =, <, > | 
| Ad Group Ad | allConversionsValueByConversionDate | Double | !=, =, <, > | 
| Ad Group Ad | conversionsByConversionDate | Double | !=, =, <, > | 
| Ad Group Ad | conversionsValueByConversionDate | Double | !=, =, <, > | 
| Ad Group Ad | valuePerAllConversionsByConversionDate | Double | !=, =, <, > | 
| Ad Group Ad | valuePerConversionsByConversionDate | Double | !=, =, <, > | 
| Ad Group Ad | activeViewMeasurableCostMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group Ad | averageCost | Double | !=, =, <, > | 
| Ad Group Ad | costPerAllConversions | Double | !=, =, <, > | 
| Ad Group Ad | costPerConversion | Double | !=, =, <, > | 
| Ad Group Ad | clicks | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group Ad | averagePageViews | Double | !=, =, <, > | 
| Ad Group Ad | videoViews | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group Ad | allConversions | Double | !=, =, <, > | 
| Ad Group Ad | averageCpc | Double | !=, =, <, > | 
| Ad Group Ad | topImpressionPercentage | Double | !=, =, <, > | 
| Ad Group Ad | impressions | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Ad Group Ad | absoluteTopImpressionPercentage | Double | !=, =, <, > | 
| Ad Group Ad | activeViewCtr | Double | !=, =, <, > | 
| Ad Group Ad | ctr | Double | !=, =, <, > | 
| Ad Group Ad | conversionAction | String | !=, = | 
| Ad Group Ad | conversionActionCategory | String | !=, = | 
| Ad Group Ad | conversionActionName | String | !=, =, LIKE | 
| Ad Group Ad | updateMask | String |  | 
| Ad Group Ad | create | Struct |  | 
| Ad Group Ad | update | Struct |  | 
| Ad Group Ad | policyValidationParameter | Struct |  | 
| Ad Group Ad | primaryStatus | String | !=, = | 
| Ad Group Ad | primaryStatusReasons | List |  | 
| Campaign | resourceName | String | !=, = | 
| Campaign | status | String | !=, = | 
| Campaign | baseCampaign | String | !=, = | 
| Campaign | name | String | !=, =, LIKE | 
| Campaign | id | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Campaign | campaignBudget | String | !=, =, LIKE | 
| Campaign | startDate | Date | BETWEEN, =, <, >, <=, >= | 
| Campaign | endDate | Date | BETWEEN, =, <, >, <=, >= | 
| Campaign | adServingOptimizationStatus | String | !=, = | 
| Campaign | advertisingChannelType | String | !=, = | 
| Campaign | advertisingChannelSubType | String | !=, = | 
| Campaign | experimentType | String | !=, = | 
| Campaign | servingStatus | String | !=, = | 
| Campaign | biddingStrategyType | String | !=, = | 
| Campaign | domainName | String | !=, =, LIKE | 
| Campaign | languageCode | String | !=, =, LIKE | 
| Campaign | useSuppliedUrlsOnly | Boolean | !=, = | 
| Campaign | positiveGeoTargetType | String | !=, = | 
| Campaign | negativeGeoTargetType | String | !=, = | 
| Campaign | paymentMode | String | !=, = | 
| Campaign | optimizationGoalTypes | List |  | 
| Campaign | date | Date | BETWEEN, =, <, >, <=, >= | 
| Campaign | averageCost | Double |  | 
| Campaign | clicks | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Campaign | costMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Campaign | impressions | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Campaign | useAudienceGrouped | Boolean | !=, = | 
| Campaign | activeViewMeasurableCostMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Campaign | costPerAllConversions | Double | !=, =, <, > | 
| Campaign | costPerConversion | Double | !=, =, <, > | 
| Campaign | invalidClicks | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Campaign | publisherPurchasedClicks | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Campaign | averagePageViews | Double | !=, =, <, > | 
| Campaign | videoViews | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Campaign | allConversionsByConversionDate | Double | !=, =, <, > | 
| Campaign | allConversionsValueByConversionDate | Double | !=, =, <, > | 
| Campaign | conversionsByConversionDate | Double | !=, =, <, > | 
| Campaign | conversionsValueByConversionDate | Double | !=, =, <, > | 
| Campaign | valuePerAllConversionsByConversionDate | Double | !=, =, <, > | 
| Campaign | valuePerConversionsByConversionDate | Double | !=, =, <, > | 
| Campaign | allConversions | Double | !=, =, <, > | 
| Campaign | absoluteTopImpressionPercentage | Double | !=, =, <, > | 
| Campaign | searchAbsoluteTopImpressionShare | Double | !=, =, <, > | 
| Campaign | averageCpc | Double | !=, =, <, > | 
| Campaign | searchImpressionShare | Double | !=, =, <, > | 
| Campaign | searchTopImpressionShare | Double | !=, =, <, > | 
| Campaign | activeViewCtr | Double | !=, =, <, > | 
| Campaign | ctr | Double | !=, =, <, > | 
| Campaign | relativeCtr | Double | !=, =, <, > | 
| Campaign | updateMask | String |  | 
| Campaign | create | Struct |  | 
| Campaign | update | Struct |  | 
| Campaign Budget | resourceName | String | !=, = | 
| Campaign Budget | id | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Campaign Budget | status | String | !=, = | 
| Campaign Budget | deliveryMethod | String | !=, = | 
| Campaign Budget | period | String | !=, = | 
| Campaign Budget | type | String | !=, = | 
| Campaign Budget | name | String | !=, =, LIKE | 
| Campaign Budget | amountMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Campaign Budget | explicitlyShared | Boolean | !=, = | 
| Campaign Budget | referenceCount | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Campaign Budget | hasRecommendedBudget | Boolean | !=, = | 
| Campaign Budget | date | Date | BETWEEN, =, <, >, <=, >= | 
| Campaign Budget | costMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Campaign Budget | startDate | Date | BETWEEN, =, <, >, <=, >= | 
| Campaign Budget | endDate | Date | BETWEEN, =, <, >, <=, >= | 
| Campaign Budget | maximizeConversionValueTargetRoas | Double | !=, =, <, > | 
| Campaign Budget | maximizeConversionsTargetCpaMicros | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Campaign Budget | selectiveOptimizationConversionActions | String |  | 
| Campaign Budget | averageCost | Double | !=, =, <, > | 
| Campaign Budget | costPerAllConversions | Double | !=, =, <, > | 
| Campaign Budget | costPerConversion | Double | !=, =, <, > | 
| Campaign Budget | videoViews | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Campaign Budget | clicks | BigInteger | BETWEEN, =, !=, <, >, <=, >= | 
| Campaign Budget | allConversions | Double | \$1=, =, <, > | 
| Campaign Budget | valuePerAllConversions | Double | \$1=, =, <, > | 
| Campaign Budget | averageCpc | Double | \$1=, =, <, > | 
| Campaign Budget | impressions | BigInteger | BETWEEN, =, \$1=, <, >, <=, >= | 
| Campaign Budget | ctr | Double | \$1=, =, <, > | 
| Campaign Budget | updateMask | String |  | 
| Campaign Budget | create | Struct |  | 
| Campaign Budget | update | Struct |  | 

 **Partitioning queries** 

 Additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` can be provided if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently. 
+  `PARTITION_FIELD`: the name of the field used to partition the query. 
+  `LOWER_BOUND`: an inclusive lower bound value of the chosen partition field. 

   For dates, we accept the Spark date format used in Spark SQL queries. Example of a valid value: `"2024-02-06"`. 
+  `UPPER_BOUND`: an exclusive upper bound value of the chosen partition field. 
+  `NUM_PARTITIONS`: the number of partitions. 

 Entity-wise partitioning field support details are captured in the following table. 


| Entity Name | Partitioning Field | Data Type | 
| --- | --- | --- | 
| Ad Group Ad | date | Date | 
| Ad Group | date | Date | 
| Campaign | date | Date | 
| Campaign Budget | date | Date | 

 **Example** 

```
googleads_read = glueContext.create_dynamic_frame.from_options(
    connection_type="googleads",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "campaign-3467***",
        "API_VERSION": "v16",
        "PARTITION_FIELD": "date",
        "LOWER_BOUND": "2024-01-01",
        "UPPER_BOUND": "2024-06-05",
        "NUM_PARTITIONS": "10"
    }
)
```

# Google Ads connection options


The following are connection options for Google Ads:
+  `ENTITY_NAME`(String) - (Required) Used for Read/Write. The name of your Object in Google Ads. 
+  `API_VERSION`(String) - (Required) Used for Read/Write. The Google Ads REST API version you want to use. Example: v16. 
+  `DEVELOPER_TOKEN`(String) - (Required) Used for Read/Write. Required to authenticate the developer or application making requests to the API. 
+  `MANAGER_ID`(String) - Used for Read/Write. A unique identifier that allows you to manage multiple Google Ads accounts. This is the customer ID of the authorized manager. If your access to the customer account is through a manager account, the `MANAGER_ID` is required. For more information, see [login-customer-id](https://developers.google.com/google-ads/api/docs/concepts/call-structure#cid). 
+  `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object. 
+  `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format. 
+  `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query. 
+  `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition query. 
+  `LOWER_BOUND`(String)- Used for Read. An inclusive lower bound value of the chosen partition field. 
+  `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field. 
+  `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read. 
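As an illustration of how these options combine, here is a minimal sketch of a read-side `connection_options` map; the connection name, entity, and field choices are placeholders rather than values from a real account:

```python
# Sketch: assembling Google Ads read options (names below are placeholders).
connection_options = {
    "connectionName": "my-googleads-connection",  # assumed AWS Glue connection name
    "ENTITY_NAME": "campaign",                    # object to read
    "API_VERSION": "v16",
    "SELECTED_FIELDS": ["id", "name", "status"],  # subset instead of selecting all columns
    "FILTER_PREDICATE": "status = 'ENABLED'",     # Spark SQL format
    "NUM_PARTITIONS": "4",
}

# In a Glue job, the map is passed to the googleads connector:
# googleads_read = glueContext.create_dynamic_frame.from_options(
#     connection_type="googleads", connection_options=connection_options)
```

Note that `SELECTED_FIELDS` and `QUERY` address the same need; supplying a full `QUERY` instead would replace the selected-fields and filter options.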

# Creating a Google Ads account


1.  Log in to your [Google Ads Developer Account](https://console.cloud.google.com) with your credentials, and go to **My Project**.   
![\[The screenshot shows the welcome screen to log in to the Google Ads Developer Account.\]](http://docs.aws.amazon.com/glue/latest/dg/images/google-ads-log-in-developer-account.png)

1.  Choose **New Project** and, if you don't already have a registered application, provide the information required to create a Google project.   
![\[The screenshot shows the select a project page. Choose New Project in the upper right hand corner.\]](http://docs.aws.amazon.com/glue/latest/dg/images/google-ads-new-project.png)  
![\[The screenshot shows the New Project window to enter a project name and choose a location.\]](http://docs.aws.amazon.com/glue/latest/dg/images/google-ads-new-project-name-location.png)

1.  Choose the **Navigation** tab, then **APIs and Services**, and create a **Client ID** and **Client Secret**, which you will need later to configure a connection between AWS Glue and Google Ads. For more information, see [API credentials](https://console.cloud.google.com/apis/credentials).   
![\[The screenshot shows the APIs and services configuration page.\]](http://docs.aws.amazon.com/glue/latest/dg/images/google-ads-apis-and-services.png)

1.  Choose **CREATE CREDENTIALS** and choose **OAuth client ID**.   
![\[The screenshot shows the APIs and services configuration page with the Create Credentials drop-down and the Oauth client ID option highlighted.\]](http://docs.aws.amazon.com/glue/latest/dg/images/google-ads-create-credentials.png)

1.  For **Application type**, select **Web application**.   
![\[The screenshot shows the Create OAuth client ID page and the Application type as Web application.\]](http://docs.aws.amazon.com/glue/latest/dg/images/google-ads-oauth-client-id-application-type.png)

1.  Under **Authorised redirect URIs**, add the OAuth Redirect URIs and choose **Create**. You can add multiple redirect URIs if required.   
![\[The screenshot shows the Create OAuth client ID page and the Authorised redirect URIs section. Here, add the URIs and choose ADD URI if needed. Once done, choose CREATE.\]](http://docs.aws.amazon.com/glue/latest/dg/images/google-ads-oauth-redirect-uris.png)

1.  Your **Client ID** and **Client Secret** are generated; you will use them when creating a connection between AWS Glue and Google Ads.   
![\[The screenshot shows the Create OAuth client ID page and the Authorised redirect URIs section. Here, add the URIs and choose ADD URI if needed. Once done, choose CREATE.\]](http://docs.aws.amazon.com/glue/latest/dg/images/google-ads-oauth-client-created.png)

1.  Choose **OAuth consent screen**, provide the required information, and add the scopes your application needs.   
![\[The screenshot shows the Update selected scopes page. Select your scopes as needed.\]](http://docs.aws.amazon.com/glue/latest/dg/images/google-ads-selected-scopes.png)

# Limitations


The following are limitations for the Google Ads connector:
+ `MANAGER_ID` is an optional input when creating a connection. However, when you want to access the customer accounts under a particular manager, `MANAGER_ID` is mandatory. The following table explains the access limitations based on whether `MANAGER_ID` is included in a connection.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/googleads-connector-limitations.html)
+ When a manager account is chosen as the object, only `Account` appears as a sub-object. In the Google Ads connector, entities such as campaigns and ads are retrieved from individual client accounts, not from the manager account.
+ You cannot retrieve metrics for the manager account. You can retrieve metrics for individual client accounts instead.
+  Each account can have up to 10,000 campaigns, including both active and paused campaigns. For more information, see [Campaign per account](https://support.google.com/google-ads/answer/6372658). 
+  When creating a report, if you choose certain metrics to display, any rows whose selected metrics are all zero won't be returned. For more information, see [ Zero Metrics ](https://developers.google.com/google-ads/api/docs/reporting/zero-metrics?hl=en#exclude_zero_metrics_by_segmenting). 
+  The Full Mapping flow does not work for the Account, Ad Group, and Ad Group Ad entities when the following fields are selected: conversionAction, conversionActionCategory, and conversionActionName. For more information, see [Segments and Metrics](https://developers.google.com/google-ads/api/docs/reporting/segmentation?hl=en#selectability_between_segments_and_metrics). 
+ A date range filter is mandatory when the `segments.date` field is selected.

# Connecting to Google Analytics 4


 Google Analytics 4 is an analytics service that tracks and reports metrics about visitor interactions with your apps and websites. These metrics include page views, active users, and events. If you are a Google Analytics 4 user, you can connect AWS Glue to your Google Analytics 4 account. You can use Google Analytics 4 as a data source in your ETL jobs. Run these jobs to transfer data from Google Analytics 4 to AWS services or other supported applications. 

**Topics**
+ [

# AWS Glue support for Google Analytics 4
](googleanalytics-support.md)
+ [

# Policies containing the API operations for creating and using connections
](googleanalytics-configuring-iam-permissions.md)
+ [

# Configuring Google Analytics 4
](googleanalytics-configuring.md)
+ [

# Configuring Google Analytics 4 connections
](googleanalytics-configuring-connections.md)
+ [

# Reading from Google Analytics 4 entities
](googleanalytics-reading-from-entities.md)
+ [

# Google Analytics 4 connection options
](googleanalytics-connection-options.md)
+ [

# Creating a Google Analytics 4 account
](googleanalytics-create-account.md)
+ [

# Steps to create a client app and OAuth 2.0 credentials
](googleanalytics-client-app-oauth-credentials.md)
+ [

# Limitations and considerations
](googleanalytics-connector-limitations.md)

# AWS Glue support for Google Analytics 4


AWS Glue supports Google Analytics 4 as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Google Analytics 4.

**Supported as a target?**  
No.

**Supported Google Analytics 4 API versions**  
 v1 Beta. 

# Policies containing the API operations for creating and using connections

 The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]


```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------
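To make the policy concrete, the sketch below builds the same document in Python and shows (commented out) one way it could be attached as an inline policy with boto3; the role and policy names are placeholder assumptions:

```python
import json

# The connection-permissions policy above, as a Python structure.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "glue:ListConnectionTypes",
                "glue:DescribeConnectionType",
                "glue:RefreshOAuth2Tokens",
                "glue:ListEntities",
                "glue:DescribeEntity",
            ],
            "Resource": "*",
        }
    ],
}

# Attaching it as an inline policy might look like this (not executed here;
# role and policy names are placeholders):
# import boto3
# boto3.client("iam").put_role_policy(
#     RoleName="MyGlueConnectionRole",
#     PolicyName="GlueConnectionPermissions",
#     PolicyDocument=json.dumps(policy),
# )
```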

You can also use the following managed IAM policies to allow access:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring Google Analytics 4


Before you can use AWS Glue to transfer data from Google Analytics 4, you must meet these requirements:

## Minimum requirements

+  You have a Google Analytics account with one or more data streams that collect the data that you want to transfer. 
+  You have a Google Cloud Platform account and a Google Cloud project. 
+  In your Google Cloud project, you've enabled the following APIs: 
  +  Google Analytics API 
  +  Google Analytics Admin API 
  +  Google Analytics Data API 
+  In your Google Cloud project, you've configured an OAuth consent screen for external users. For information about the OAuth consent screen, see [Setting up your OAuth consent screen](https://support.google.com/cloud/answer/10311615#) in the Google Cloud Platform Console Help. 
+  In your Google Cloud project, you've configured an OAuth 2.0 client ID. For more information, see [Setting up OAuth 2.0 ](https://support.google.com/cloud/answer/6158849?hl=en#zippy=). 

 If you meet these requirements, you’re ready to connect AWS Glue to your Google Analytics 4 account. 

# Configuring Google Analytics 4 connections


To configure a Google Analytics 4 connection:

1.  In AWS Secrets Manager, create a secret with the following details. A secret is required for each connection in AWS Glue. 

   1.  For AuthorizationCode grant type: 
      +  For customer managed connected app – Secret should contain the connected app Consumer Secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as key. 

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps: 

   1. When selecting a **Connection type**, select Google Analytics 4.

   1. Provide the `INSTANCE_URL` of the Google Analytics 4 instance you want to connect to.

   1.  Select an IAM role that AWS Glue can assume and that has permissions for the following actions: 

------
#### [ JSON ]


      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1.  Select the secret (`secretName`) that you want AWS Glue to use for this connection to store the tokens. 

   1.  Select network options if you want to use your own network. 

1.  Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 

 **`AUTHORIZATION_CODE` grant type** 

 This grant type is considered "three-legged" OAuth because it relies on redirecting users to the third-party authorization server to authenticate. It is used when creating connections through the AWS Glue console. The console redirects the user to Google Analytics 4, where the user must log in and grant AWS Glue the requested permissions to access their Google Analytics 4 instance. 

 Users may still opt to create their own connected app in Google Analytics 4 and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they are still redirected to Google Analytics 4 to log in and authorize AWS Glue to access their resources. 

 This grant type results in a refresh token and an access token. The access token is short-lived and can be refreshed automatically, without user interaction, using the refresh token. 

 For more information, see [Using OAuth 2.0 to Access Google APIs](https://developers.google.com/identity/protocols/oauth2). 

# Reading from Google Analytics 4 entities


 **Prerequisites** 
+  A Google Analytics 4 object you would like to read from. Refer to the supported entities table below to check the available entities. 

 **Supported entities** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select * | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Real-Time Report | Yes | Yes | Yes | Yes | No | 
| Core Report | Yes | Yes | Yes | Yes | Yes | 

 **Example** 

```
googleAnalytics4_read = glueContext.create_dynamic_frame.from_options(
    connection_type="GoogleAnalytics4",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v1beta"
    }
)
```

 **Google Analytics 4 entity and field details** 


| Entity | Field | Data Type | Supported Operators | 
| --- | --- | --- | --- | 
| Core Report | Dynamic Fields |  |  | 
| Core Report | Dimension Fields | String | LIKE, = | 
| Core Report | Dimension Fields | Date | LIKE, = | 
| Core Report | Metric Fields | String | >, <, >=, <=, =, BETWEEN | 
| Core Report | Custom Dimension and Custom Metric Fields | String | NA | 
| Real-Time Report | appVersion | String | LIKE, = | 
| Real-Time Report | audienceId | String | LIKE, = | 
| Real-Time Report | audienceName | String | LIKE, = | 
| Real-Time Report | city | String | LIKE, = | 
| Real-Time Report | cityId | String | LIKE, = | 
| Real-Time Report | country | String | LIKE, = | 
| Real-Time Report | countryId | String | LIKE, = | 
| Real-Time Report | deviceCategory | String | LIKE, = | 
| Real-Time Report | eventName | String | LIKE, = | 
| Real-Time Report | minutesAgo | String | LIKE, = | 
| Real-Time Report | platform | String | LIKE, = | 
| Real-Time Report | streamId | String | LIKE, = | 
| Real-Time Report | streamName | String | LIKE, = | 
| Real-Time Report | unifiedScreenName | String | LIKE, = | 
| Real-Time Report | activeUsers | String | >, <, >=, <=, =, BETWEEN | 
| Real-Time Report | conversions | String | >, <, >=, <=, =, BETWEEN | 
| Real-Time Report | eventCount | String | >, <, >=, <=, =, BETWEEN | 
| Real-Time Report | screenPageViews | String | >, <, >=, <=, =, BETWEEN | 
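As a sketch of how the operators in the table translate into a read, a `FILTER_PREDICATE` combining a dimension filter (`LIKE`/`=`) with a metric range condition might look like the following; the connection name and filter values are placeholders:

```python
# Sketch: filtering a Real-Time Report read (placeholder names and values).
connection_options = {
    "connectionName": "my-ga4-connection",
    "ENTITY_NAME": "Real-Time Report",
    "API_VERSION": "v1beta",
    # Dimension fields support LIKE and =; metric fields support >, <, >=, <=, =, BETWEEN.
    "FILTER_PREDICATE": "country = 'United States' AND eventCount >= 100",
}

# In a Glue job:
# rt_report = glueContext.create_dynamic_frame.from_options(
#     connection_type="GoogleAnalytics4", connection_options=connection_options)
```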

 **Partitioning queries** 

1.  **Filter-based partition** 

    Additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` can be provided if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently. 
   +  `PARTITION_FIELD`: the name of the field used to partition the query. 
   +  `LOWER_BOUND`: an inclusive lower bound value of the chosen partition field. 

      For dates, we accept the Spark date format used in Spark SQL queries. Example of a valid value: `"2024-02-06"`. 
   +  `UPPER_BOUND`: an exclusive upper bound value of the chosen partition field. 
   +  `NUM_PARTITIONS`: the number of partitions. 

    **Example** 

   ```
   googleAnalytics4_read = glueContext.create_dynamic_frame.from_options(
       connection_type="GoogleAnalytics4",
       connection_options={
           "connectionName": "connectionName",
           "ENTITY_NAME": "entityName",
           "API_VERSION": "v1beta",
           "PARTITION_FIELD": "date",
           "LOWER_BOUND": "2022-01-01",
           "UPPER_BOUND": "2024-01-02",
           "NUM_PARTITIONS": "10"
       }
   )
   ```

1.  **Record-based partition** 

    An additional Spark option, `NUM_PARTITIONS`, can be provided if you want to utilize concurrency in Spark. With this parameter, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently. 
   +  `NUM_PARTITIONS`: the number of partitions. 

    **Example** 

   ```
   googleAnalytics4_read = glueContext.create_dynamic_frame.from_options(
       connection_type="GoogleAnalytics4",
       connection_options={
           "connectionName": "connectionName",
           "ENTITY_NAME": "entityName",
           "API_VERSION": "v1beta",
           "NUM_PARTITIONS": "10"
       }
   )
   ```

# Google Analytics 4 connection options


The following are connection options for Google Analytics 4:
+  `ENTITY_NAME`(String) - (Required) Used for Read. The name of your Object in Google Analytics 4. 
+  `API_VERSION`(String) - (Required) Used for Read. The Google Analytics 4 REST API version you want to use. 
+  `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object. 
+  `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format. 
+  `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query. 
+  `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition query. 
+  `LOWER_BOUND`(String)- Used for Read. An inclusive lower bound value of the chosen partition field. 
+  `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field. 
+  `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read. 
+  `INSTANCE_URL`(String) - (Optional) Used for Read. 

# Creating a Google Analytics 4 account


 Follow the steps to create a Google Analytics 4 account: [ https://support.google.com/analytics/answer/9304153?hl=en ](https://support.google.com/analytics/answer/9304153?hl=en) 

# Steps to create a client app and OAuth 2.0 credentials


 For more information, see [ Google Analytics4 API documentation ](https://developers.google.com/analytics/devguides/reporting/data/v1). 

1.  Create and set up your account by logging in to your [ Google Analytics Account ](https://analytics.google.com/) with your credentials. Then navigate to **Admin** > **Create Account**. 

1.  Create a property for the account you have created by choosing **Create Property**. Set up the property with the required details. Once all the details are provided, the corresponding property ID is generated. 

1.  Add a data stream for the property by choosing **Data Streams** > **Add Stream** > **Web** from the drop-down. Provide the website details, such as the URL and other required fields. After providing all details, the corresponding **stream ID** and **measurement ID** are generated. 

1.  Set up Google Analytics in your website by copying the measurement ID and adding it to your website's configuration. 

1.  Create a report from Google Analytics by navigating to **Reports** and generating the required report. 

1.  Authorize your app by navigating to [console.cloud.google.com](https://console.cloud.google.com), searching for the Google Analytics Data API, and enabling the API. 

   1.  Navigate to the APIs and Services page and choose **Credentials**, then set up an OAuth 2.0 client ID. 

   1.  Provide a redirect URL by adding the AWS Glue redirect URL. 

1.  Copy the client ID and client secret; you will need them later when creating the connection. 

# Limitations and considerations


The following are limitations for the Google Analytics 4 connector:
+  For the Core Report entity, only 9 dimension fields and 10 metric fields can be sent in a request. If the allowed number of fields is exceeded, the request fails and the connector throws an error message. 
+  For the Real-Time Report entity, only 4 dimension fields can be sent in a request. If the allowed number of fields is exceeded, the request fails and the connector throws an error message. 
+  Google Analytics 4 is a free tool in beta, so expect regular updates: new features, entity enhancements, new fields, and deprecation of existing fields. 
+  Core Report fields are populated dynamically, so fields can be added, deprecated, or renamed, and new limits can be imposed on fields, at any time. 
+  The default start date is 30 days before the current date, and the default end date is yesterday (one day before the current date). These dates are overridden in the filter expression if the user has set a value or if the flow is incremental. 
+  Per the documentation, the Real-Time Report entity returns 10,000 records if no limit is passed in the request; otherwise, the API returns a maximum of 250,000 rows per request, no matter how many you ask for. For more information, see [Method: properties.runRealtimeReport](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runRealtimeReport) in the Google Analytics documentation. 
+  The Real-Time Report entity does not support record-based partitioning because it does not support pagination. It also does not support field-based partitioning because none of its fields meet the defined criteria. 
+  Due to the limit on the number of fields that can be passed in a request, default dimension and metric fields are set within the designated limits. If "select all" is chosen, only data from those predetermined fields is retrieved. 
  +  Core Report 
    +  Per the SaaS limitation, requests are allowed up to 9 dimensions and up to 10 metrics (that is, a request can contain a maximum of 19 fields: metrics plus dimensions). 
    +  Per the implementation, if the user selects all fields or selects more than 25 fields, the default fields are passed in the request. 
    +  The following fields are considered default fields for Core Report: "country", "city", "eventName", "cityId", "browser", "date", "currencyCode", "deviceCategory", "transactionId", "active1DayUsers", "active28DayUsers", "active7DayUsers", "activeUsers", "averagePurchaseRevenue", "averageRevenuePerUser", "averageSessionDuration", "engagedSessions", "eventCount", "engagementRate". 
  +  Real-Time Report 
    +  Per the SaaS limitation, requests are allowed up to 4 dimensions. 
    +  If the user selects all fields or selects more than 15 fields, the default fields are passed in the request. 
    +  The following fields are considered default fields for Real-Time Report: "country", "deviceCategory", "city", "cityId", "activeUsers", "conversions", "eventCount", "screenPageViews". 
+  In the Core Report entity, if partitioning on the date field and a filter on startDate are present simultaneously, the dateRange value would be overridden with the startDate filter value. However, because partitioning always takes priority, the startDate filter is discarded when partitioning on the date field is present. 
+  Because cohortSpecs is now part of the Core Report request body, the Core Report entity has been enhanced to support the cohortSpec attribute. In the cohortSpecs request body, nearly all fields require user input. To address this, default values are set for those attributes, with a provision for the user to override these values if needed.     
<a name="google-analytics-connector-limitations-table"></a>[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/googleanalytics-connector-limitations.html)
+  You can also pass all these filters together at once or with other filters. 
  +  Example 1 - filterPredicate: startDate between "2023-05-09" and "2023-05-10" AND startOffset=1 AND endOffset=2 AND granularity="WEEKLY" 
  +  Example 2 - filterPredicate: city="xyz" AND startOffset=1 AND endOffset=2 AND granularity="WEEKLY" 
+  In cohort request: 
  +  If `cohortNthMonth` is passed in the request, the granularity value is internally set to "MONTHLY". 
  +  Similarly, if `cohortNthWeek` is passed, the granularity value is set to "WEEKLY". 
  +  And for `cohortNthDay`, the granularity value is set to "DAILY". For more information, see: 
    +  [ https://developers.google.com/analytics/devguides/reporting/data/v1/advanced ](https://developers.google.com/analytics/devguides/reporting/data/v1/advanced) 
    +  [ https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/CohortSpec ](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/CohortSpec) 
  +  Provision is given for the user to override dateRange and granularity default value. Refer to the above table. 
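To tie the cohort options together, a cohort-style read could pass the offsets and granularity through `FILTER_PREDICATE`, as in the examples above; this is a sketch with placeholder values:

```python
# Sketch: cohort-style filter for the Core Report entity (placeholder values).
filter_predicate = (
    'startDate between "2023-05-09" and "2023-05-10" '
    'AND startOffset=1 AND endOffset=2 AND granularity="WEEKLY"'
)

connection_options = {
    "connectionName": "my-ga4-connection",   # placeholder connection name
    "ENTITY_NAME": "Core Report",
    "API_VERSION": "v1beta",
    "FILTER_PREDICATE": filter_predicate,
}

# In a Glue job:
# cohort_read = glueContext.create_dynamic_frame.from_options(
#     connection_type="GoogleAnalytics4", connection_options=connection_options)
```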

# Connecting to Google BigQuery in AWS Glue Studio

**Note**  
  You can use AWS Glue for Spark to read from and write to tables in Google BigQuery in AWS Glue 4.0 and later versions. To configure Google BigQuery with AWS Glue jobs programmatically, see  [BigQuery connections](aws-glue-programming-etl-connect-bigquery-home.md).  

 AWS Glue Studio provides a visual interface to connect to BigQuery, author data integration jobs, and run them on the AWS Glue Studio serverless Spark runtime. 

 When creating a connection to Google BigQuery in AWS Glue Studio, a unified connection is created. For more information, see [Considerations](using-connectors-unified-connections.md#using-connectors-unified-connections-considerations). 

 Instead of creating a secret with the credentials in a specific format, `{"credentials": "base64 encoded JSON"}`, with a unified connection to Google BigQuery you can now create a secret that directly includes the JSON from Google BigQuery: `{"type": "service-account", ...}`. 
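If you still need the older secret format, the base64 wrapping described above can be produced as in this sketch (the key content is a truncated placeholder, not a real credential):

```python
import base64
import json

# A stand-in for the downloaded service-account key (values truncated).
key_json = json.dumps({"type": "service_account", "project_id": "my-project"})

# Older secret format: the whole key JSON, base64-encoded, under "credentials".
legacy_secret = {"credentials": base64.b64encode(key_json.encode()).decode("ascii")}

# Decoding the stored value round-trips back to the original key JSON.
decoded = json.loads(base64.b64decode(legacy_secret["credentials"]))
```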

**Topics**
+ [

# Creating a BigQuery connection
](creating-bigquery-connection.md)
+ [

# Creating a BigQuery source node
](creating-bigquery-source-node.md)
+ [

# Creating a BigQuery target node
](creating-bigquery-target-node.md)
+ [

## Advanced options
](#creating-bigquery-connection-advanced-options)

# Creating a BigQuery connection


To connect to Google BigQuery from AWS Glue, you will need to create and store your Google Cloud Platform credentials in an AWS Secrets Manager secret, then associate that secret with a Google BigQuery AWS Glue connection.

**To configure a connection to BigQuery:**

1. In Google Cloud Platform, create and identify relevant resources:
   + Create or identify a GCP project containing BigQuery tables you would like to connect to.
   + Enable the BigQuery API. For more information, see [ Use the BigQuery Storage Read API to read table data ](https://cloud.google.com/bigquery/docs/reference/storage/#enabling_the_api).

1. In Google Cloud Platform, create and export service account credentials:

   You can use the BigQuery credentials wizard to expedite this step: [Create credentials](https://console.cloud.google.com/apis/credentials/wizard?api=bigquery.googleapis.com).

   To create a service account in GCP, follow the tutorial available in [Create service accounts](https://cloud.google.com/iam/docs/service-accounts-create).
   + When selecting **project**, select the project containing your BigQuery table.
   + When selecting GCP IAM roles for your service account, add or create a role that grants appropriate permissions to run BigQuery jobs to read, write, or create BigQuery tables.

   To create credentials for your service account, follow the tutorial available in [Create a service account key](https://cloud.google.com/iam/docs/keys-create-delete#creating).
   + When selecting key type, select **JSON**.

   You should now have downloaded a JSON file with credentials for your service account. It should look similar to the following:

   ```
   {
     "type": "service_account",
     "project_id": "*****",
     "private_key_id": "*****",
     "private_key": "*****",
     "client_email": "*****",
     "client_id": "*****",
     "auth_uri": "https://accounts.google.com/o/oauth2/auth",
     "token_uri": "https://oauth2.googleapis.com/token",
     "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
     "client_x509_cert_url": "*****",
     "universe_domain": "googleapis.com"
   }
   ```

1. In AWS Secrets Manager, create a secret using your downloaded credentials file. You can choose the **Plaintext** tab and paste the JSON formatted file content. To create a secret in Secrets Manager, follow the tutorial available in [Create an AWS Secrets Manager secret](https://docs.aws.amazon.com/secretsmanager/latest/userguide/create_secret.html) in the AWS Secrets Manager documentation. After creating the secret, keep the secret name, *secretName*, for the next step. 

1. In the AWS Glue Data Catalog, create a connection by following the steps in [https://docs.aws.amazon.com/glue/latest/dg/console-connections.html](https://docs.aws.amazon.com/glue/latest/dg/console-connections.html). After creating the connection, keep the connection name, *connectionName*, for the next step. 
   + When selecting a **Connection type**, select Google BigQuery.
   + When selecting an **AWS Secret**, provide *secretName*.

1. Grant the IAM role associated with your AWS Glue job permission to read *secretName*.

1. In your AWS Glue job configuration, provide *connectionName* as an **Additional network connection**.
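Step 3 of this procedure (storing the downloaded credentials file in Secrets Manager) can also be scripted. The following is a minimal sketch using boto3; the secret name and key file path are hypothetical placeholders, not values from this guide.

```python
import json


def validate_service_account(creds: dict) -> bool:
    """Sanity-check that a dict looks like a GCP service account key file."""
    required = {"type", "project_id", "private_key", "client_email"}
    return creds.get("type") == "service_account" and required <= creds.keys()


def store_bigquery_secret(secret_name: str, key_file_path: str) -> str:
    """Store the service account JSON as a Secrets Manager secret; return its ARN."""
    import boto3  # imported here so the validation helper has no AWS dependency

    with open(key_file_path) as f:
        creds = json.load(f)
    if not validate_service_account(creds):
        raise ValueError("not a GCP service account key file")
    client = boto3.client("secretsmanager")
    resp = client.create_secret(Name=secret_name, SecretString=json.dumps(creds))
    return resp["ARN"]
```

Calling `store_bigquery_secret("bigquery-credentials", "my-project-key.json")` requires AWS credentials with `secretsmanager:CreateSecret` permission.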

# Creating a BigQuery source node


## Prerequisites needed

+ A BigQuery type AWS Glue Data Catalog connection
+ An AWS Secrets Manager secret for your Google BigQuery credentials, used by the connection.
+ Appropriate permissions on your job to read the secret used by the connection.
+ The name and dataset of the table and corresponding Google Cloud project you would like to read.

## Adding a BigQuery data source


**To add a Data source – BigQuery node:**

1.  Choose the connection for your BigQuery data source. If you have already created it, it is available in the dropdown. If you need to create a connection, choose **Create BigQuery connection**. For more information, see [Overview of using connectors and connections](https://docs.aws.amazon.com/glue/latest/ug/connectors-chapter.html#using-connectors-overview). 

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1. Identify what BigQuery data you would like to read, then choose a **BigQuery Source** option:
   + Choose a single table – allows you to pull all data from a table.
   + Enter a custom query – allows you to customize which data is retrieved by providing a query.

1.  Describe the data you would like to read.

   **(Required)** Set **Parent Project** to the project containing your table, or to a billing parent project, if relevant.

   If you chose a single table, set **Table** to the name of a Google BigQuery table in the following format: `[dataset].[table]`.

   If you chose a query, provide it in **Query**. In your query, refer to tables by their fully qualified table name, in the format: `[project].[dataset].[tableName]`.

1.  Provide BigQuery properties 

   If you chose a single table, you do not need to provide additional properties.

   If you chose a query, you must provide the following **Custom Google BigQuery properties**:
   + Set `viewsEnabled` to true.
   + Set `materializationDataset` to a dataset. The GCP principal authenticated by the credentials provided through the AWS Glue connection must be able to create tables in this dataset.
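In a Glue for Spark script, the source configuration above corresponds to connection options on the BigQuery connector. The following sketch shows both variants; the connection, project, dataset, and column names are placeholder assumptions.

```python
# Placeholder names -- substitute your own connection, project, and dataset.
CONNECTION_NAME = "my-bigquery-connection"
PARENT_PROJECT = "my-gcp-project"

# Variant 1: read a single table. No additional properties are needed.
single_table_options = {
    "connectionName": CONNECTION_NAME,
    "parentProject": PARENT_PROJECT,
    "table": "my_dataset.my_table",  # [dataset].[table] format
}

# Variant 2: read with a custom query. viewsEnabled and materializationDataset
# are required, and tables use the [project].[dataset].[tableName] format.
custom_query_options = {
    "connectionName": CONNECTION_NAME,
    "parentProject": PARENT_PROJECT,
    "query": "SELECT id, name FROM `my-gcp-project.my_dataset.my_table`",
    "viewsEnabled": "true",
    "materializationDataset": "my_materialization_dataset",
}

# Inside a Glue job script, where glueContext is available, the read would be:
# frame = glueContext.create_dynamic_frame.from_options(
#     connection_type="bigquery", connection_options=single_table_options)
```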

# Creating a BigQuery target node


## Prerequisites needed

+ A BigQuery type AWS Glue Data Catalog connection
+ An AWS Secrets Manager secret for your Google BigQuery credentials, used by the connection.
+ Appropriate permissions on your job to read the secret used by the connection.
+ The name and dataset of the table and corresponding Google Cloud project you would like to write to.

## Adding a BigQuery data target


**To add a Data target – BigQuery node:**

1.  Choose the connection for your BigQuery data target. If you have already created it, it is available in the dropdown. If you need to create a connection, choose **Create BigQuery connection**. For more information, see [Overview of using connectors and connections](https://docs.aws.amazon.com/glue/latest/ug/connectors-chapter.html#using-connectors-overview). 

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1. Identify what BigQuery table you would like to write to, then choose a **Write method**.
   + Direct – writes to BigQuery directly using the BigQuery Storage Write API.
   + Indirect – writes to Google Cloud Storage, then copies to BigQuery.

   If you would like to write indirectly, provide a destination GCS location with **Temporary GCS bucket**. You will need to provide additional configuration in your AWS Glue connection. For more information, see [Using indirect write with Google BigQuery](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-connect-bigquery-home.html#aws-glue-programming-etl-connect-bigquery-indirect-write).

1.  Describe the data you would like to write.

   **(Required)** Set **Parent Project** to the project containing your table, or to a billing parent project, if relevant.

   Set **Table** to the name of the target Google BigQuery table in the following format: `[dataset].[table]`.
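As with the source, the target configuration maps to connection options in a Glue for Spark script. This sketch assumes the `writeMethod` and `temporaryGcsBucket` option names used by the underlying Spark BigQuery connector; the resource names are placeholders.

```python
# Placeholder names -- substitute your own connection, project, and table.
direct_write_options = {
    "connectionName": "my-bigquery-connection",
    "parentProject": "my-gcp-project",
    "table": "my_dataset.my_target_table",  # [dataset].[table] format
    "writeMethod": "direct",  # write via the BigQuery Storage Write API
}

# Indirect writes stage data in a Google Cloud Storage bucket first.
indirect_write_options = {
    **direct_write_options,
    "writeMethod": "indirect",
    "temporaryGcsBucket": "my-temp-gcs-bucket",
}

# Inside a Glue job script, where glueContext and a DynamicFrame exist:
# glueContext.write_dynamic_frame.from_options(
#     frame=frame, connection_type="bigquery",
#     connection_options=direct_write_options)
```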

## Advanced options


You can provide advanced options when creating a BigQuery node. These options are the same as those available when programming AWS Glue for Spark scripts.

See [BigQuery connection option reference](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-connect-bigquery-home.html) in the AWS Glue developer guide. 

# Connecting to Google Search Console

Google Search Console is a free platform available to website owners for monitoring how Google views the site and optimizing its organic presence. This includes viewing referring domains, mobile site performance, rich search results, and the highest-traffic queries and pages. If you are a Google Search Console user, you can connect AWS Glue to your Google Search Console account. You can use Google Search Console as a data source in your ETL jobs. Run these jobs to transfer data from Google Search Console to AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for Google Search Console
](google-search-console-support.md)
+ [

# Policies containing the API operations for creating and using connections
](google-search-console-configuring-iam-permissions.md)
+ [

# Configuring Google Search Console
](google-search-console-configuring.md)
+ [

# Configuring Google Search Console connections
](google-search-console-configuring-connections.md)
+ [

# Reading from Google Search Console entities
](google-search-console-reading-from-entities.md)
+ [

# Google Search Console connection options
](google-search-console-connection-options.md)
+ [

# Google Search Console limitations
](google-search-console-limitations.md)

# AWS Glue support for Google Search Console


AWS Glue supports Google Search Console as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Google Search Console.

**Supported as a target?**  
No.

**Supported Google Search Console API versions**  
The following Google Search Console API versions are supported:
+ v3

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Google Search Console


Before you can use AWS Glue to transfer data from Google Search Console, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a Google Search Console account.
+ You have a Google Cloud Platform account and a Google Cloud project.
+ In your Google Cloud project, you've enabled the Google Search Console API.
+ In your Google Cloud project, you've configured an OAuth consent screen for external users. For more information, see [Setting up your OAuth consent screen ](https://support.google.com/cloud/answer/10311615) in the Google Cloud Platform Console Help.
+ In your Google Cloud project, you've configured an OAuth 2.0 client ID. See [Setting up OAuth 2.0](https://support.google.com/cloud/answer/6158849) for the client credentials that AWS Glue uses to access your data securely when it makes authenticated calls to your account.

If you meet these requirements, you’re ready to connect AWS Glue to your Google Search Console account. For typical connections, you don't need to do anything else in Google Search Console.

# Configuring Google Search Console connections


Google Search Console supports the AUTHORIZATION_CODE grant type for OAuth2. The grant type determines how AWS Glue communicates with Google Search Console to request access to your data.
+ This grant type is considered "three-legged" OAuth as it relies on redirecting users to a third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue console.
+ Users may still opt to create their own connected app in Google Search Console and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to Google Search Console to log in and authorize AWS Glue to access their resources.
+ This grant type results in a refresh token and access token. The access token is short lived, and may be refreshed automatically without user interaction using the refresh token.
+ For public Google Search Console documentation on creating a connected app for Authorization Code OAuth flow, see [Using OAuth 2.0 to Access Google APIs](https://developers.google.com/identity/protocols/oauth2).

To configure a Google Search Console connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For the customer managed connected app, the Secret should contain the connected app Consumer Secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as key.

   1. Note: you must create a secret for your connections in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps:

   1. When selecting a **Connection type**, select Google Search Console.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the secret, `secretName`, that you want AWS Glue to use for this connection to store the tokens.

   1. Optionally, select network options if you want to use your own network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.
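The secret in step 1 can also be created programmatically. The sketch below builds the expected `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` payload and stores it with boto3; the secret name and function names are illustrative placeholders.

```python
import json


def build_connected_app_secret(client_secret: str) -> str:
    """Build the SecretString payload for a customer managed connected app."""
    return json.dumps(
        {"USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET": client_secret}
    )


def create_connection_secret(secret_name: str, client_secret: str) -> str:
    """Create the Secrets Manager secret and return its ARN."""
    import boto3  # imported here so the payload helper has no AWS dependency

    client = boto3.client("secretsmanager")
    resp = client.create_secret(
        Name=secret_name,
        SecretString=build_connected_app_secret(client_secret),
    )
    return resp["ARN"]
```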

# Reading from Google Search Console entities


**Prerequisite**

A Google Search Console object you would like to read from. You will need the object name.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Search Analytics | Yes | Yes | No | Yes | No | 
| Sites | No | No | No | Yes | No | 
| Sitemaps | No | No | No | Yes | No | 

**Example**:

```
googleSearchConsole_read = glueContext.create_dynamic_frame.from_options(
    connection_type="googlesearchconsole",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v3"
    }
)
```

**Google Search Console entity and field details**:

Google Search Console provides endpoints to fetch metadata dynamically for supported entities. Accordingly, operator support is captured at the datatype level.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/google-search-console-reading-from-entities.html)

**Note**  
For an updated list of valid values for filters, see the [Google Search Console](https://developers.google.com/webmaster-tools/v1/searchanalytics/query) API docs.  
The field `start_end_date` is a combination of `start_date` and `end_date`.

## Partitioning queries


Filter-based partitioning and record-based partitioning are not supported.

# Google Search Console connection options


The following are connection options for Google Search Console:
+ `ENTITY_NAME` (String) - (Required) Used for Read. The name of your object in Google Search Console.
+ `API_VERSION` (String) - (Required) Used for Read. The Google Search Console REST API version you want to use.
+ `SELECTED_FIELDS` (List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE` (String) - Default: `start_end_date between <30 days ago from current date> AND <yesterday, that is, 1 day ago from current date>`. Used for Read. Must be in Spark SQL format.
+ `QUERY` (String) - Default: `start_end_date between <30 days ago from current date> AND <yesterday, that is, 1 day ago from current date>`. Used for Read. Full Spark SQL query.
+ `INSTANCE_URL` (String) - Used for Read. A valid Google Search Console instance URL.
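Put together, these options can be passed to a read, building on the earlier example. The field names, date literals, and the comma-separated form of `SELECTED_FIELDS` shown here are illustrative assumptions.

```python
# Placeholder connection name; entity and version follow the earlier example.
gsc_options = {
    "connectionName": "my-gsc-connection",
    "ENTITY_NAME": "Search Analytics",
    "API_VERSION": "v3",
    # Narrow the columns returned (default is empty, i.e. SELECT *).
    "SELECTED_FIELDS": "clicks,impressions,ctr",
    # Spark SQL predicate; start_end_date combines start_date and end_date.
    "FILTER_PREDICATE": "start_end_date between '2024-01-01' AND '2024-01-31'",
}

# Inside a Glue job script, where glueContext is available:
# frame = glueContext.create_dynamic_frame.from_options(
#     connection_type="googlesearchconsole", connection_options=gsc_options)
```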

# Google Search Console limitations


The following are limitations or notes for Google Search Console:
+ Google Search Console enforces usage limits on the API. For more information, see [Usage Limits](https://developers.google.com/webmaster-tools/limits).
+ When no filter is passed for the `Search Analytics` entity, the API sums up all the clicks, impressions, CTR, and other data for your entire site within the specified default date range and presents it as a single record.
+ To break down the data into smaller segments, you need to introduce dimensions in your query. Dimensions tell the API how you want to segment your data.
  + For example, if you add `filterPredicate: dimensions="country"`, you'll get one record for each country where your site received traffic during the specified period.
  + To pass multiple dimensions: `filterPredicate: dimensions="country" AND dimensions="device" AND dimensions="page"`. In this case you'll get one row in the response for each unique combination of these three dimensions.
+ Default values are set for the `start_end_date` and `dataState` fields.     
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/google-search-console-limitations.html)

# Connecting to Google Sheets


 Google Sheets is an online spreadsheet software that allows you to organize large amounts of data, create custom reports, automate calculations, and collaborate with others. If you're a Google Sheets user, you can connect AWS Glue to your Google Sheets account. Then, you can use Google Sheets as a data source in your ETL jobs. Run these jobs to transfer data between Google Sheets and AWS services or other supported applications. 

**Topics**
+ [

# AWS Glue support for Google Sheets
](googlesheets-support.md)
+ [

# Policies containing the API operations for creating and using connections
](googlesheets-configuring-iam-permissions.md)
+ [

# Configuring Google Sheets
](googlesheets-configuring.md)
+ [

# Configuring Google Sheets connections
](googlesheets-configuring-connections.md)
+ [

# Reading from Google Sheets entities
](googlesheets-reading-from-entities.md)
+ [

# Google Sheets connection options
](googlesheets-connection-options.md)
+ [

# Set up Authorization code OAuth flow for Google Sheets
](googlesheets-oauth-authorization.md)
+ [

# Limitations for Google Sheets connector
](googlesheets-connector-limitations.md)

# AWS Glue support for Google Sheets


AWS Glue supports Google Sheets as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Google Sheets.

**Supported as a target?**  
No.

**Supported Google Sheets API versions**  
 Google Sheets API v4 and Google Drive API v3 

# Policies containing the API operations for creating and using connections

 The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

You can also use the following managed IAM policies to allow access:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Google Sheets


Before you can use AWS Glue to transfer data from Google Sheets, you must meet these requirements:

## Minimum requirements

+ You have a Google Sheets account with Email and Password.
+  Your Google Sheets account is enabled for API access. All use of the Google Sheets API is available at no additional cost. 
+  Your Google Sheets account allows you to install connected apps. If you lack access to this functionality, contact your Google Sheets administrator. 

 If you meet these requirements, you’re ready to connect AWS Glue to your Google Sheets account. 

# Configuring Google Sheets connections


To configure a Google Sheets connection:

1. In AWS Secrets Manager, create a secret with the following details: 

   1.  For AuthorizationCode grant type: 
      +  For a customer managed connected app – the secret should contain the connected app consumer secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key. 

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps: 

   1. When selecting a **Data Source**, select Google Sheets.

   1. Provide the Google Sheets environment.

      1.  Select the secret, `secretName`, that you want AWS Glue to use for this connection to store the tokens. 

      1.  Optionally, select network options if you want to use your own network. 

1.  Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 

------
#### [ JSON ]

****  

   ```
   {
     "Version": "2012-10-17",
     "Statement": [
       {
         "Effect": "Allow",
         "Action": [
           "secretsmanager:DescribeSecret",
           "secretsmanager:GetSecretValue",
           "secretsmanager:PutSecretValue",
           "ec2:CreateNetworkInterface",
           "ec2:DescribeNetworkInterfaces",
           "ec2:DeleteNetworkInterface"
         ],
         "Resource": "*"
       }
     ]
   }
   ```

------

 **AUTHORIZATION_CODE Grant Type** 

 This grant type is considered “three-legged” OAuth as it relies on redirecting users to a third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue console. The AWS Glue console redirects the user to Google Sheets, where the user must log in and grant AWS Glue the requested permissions to access their Google Sheets instance. 

 Users may opt to create their own connected app in Google Sheets and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to Google Sheets to log in and authorize AWS Glue to access their resources. 

 This grant type results in a refresh token and access token. The access token is short lived, and may be refreshed automatically without user interaction using the refresh token. 

 For more information, see [ public Google Sheets documentation on creating a connected app for Authorization Code OAuth flow ](https://developers.google.com/workspace/guides/create-credentials). 

# Reading from Google Sheets entities


 **Prerequisites** 
+  A Google spreadsheet that you would like to read from. You will need the spreadsheet ID and the sheet tab name. 

 **Google Sheets Entity and Field Details:** 


| Entity | Data Type | Supported Operators | 
| --- | --- | --- | 
| Spreadsheet | String | N/A (filter is not supported) | 

 **Example** 

```
googleSheets_read = glueContext.create_dynamic_frame.from_options(
    connection_type="googlesheets",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "{SpreadSheetID}#{SheetTabName}",
        "API_VERSION": "v4"
    }
)
```

 **Partitioning queries** 

 For record-based partitioning only, `NUM_PARTITIONS` can be provided as an additional Spark option if you want to utilize concurrency in Spark. With this parameter, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently. 

 **Example with `NUM_PARTITIONS`** 

```
googlesheets_read = glueContext.create_dynamic_frame.from_options(
    connection_type="googlesheets",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "{SpreadSheetID}#{SheetTabName}",
        "API_VERSION": "v4",
        "NUM_PARTITIONS": "10"
    }
)
```

# Google Sheets connection options


The following are connection options for Google Sheets:
+  `ENTITY_NAME`(String) - (Required) Used for Read. The `SpreadSheet ID` and `sheetTabName` in Google Sheets. Example: `{SpreadSheetID}#{SheetTabName}`. 
+  `API_VERSION`(String) - (Required) Used for Read. Google Sheets Rest API version you want to use. 
+  `SELECTED_FIELDS` (List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object. 
+  `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format. 
+  `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query. 
+  `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read. 

# Set up Authorization code OAuth flow for Google Sheets


 **Prerequisites** 
+  A Google account where you can sign in to use the Google Sheets app. In your Google account, Google Sheets contains the data that you want to transfer. 
+  A Google Cloud Platform account and a Google Cloud project. See [ Create Google Cloud Project ](https://developers.google.com/workspace/guides/create-project) for more details. 

**To set up your Google account and get OAuth 2.0 credentials:**

1.  Once the Google Cloud project is setup, enable the Google Sheets API and Google Drive APIs in the project. For the steps to enable them, see [ Enable and disable APIs ](https://support.google.com/googleapi/answer/6158841) in the API Console Help for Google Cloud Platform. 

1.  Next, configure an OAuth consent screen for external users. For more information about the OAuth consent screen, see [ Setting up your OAuth consent screen ](https://support.google.com/cloud/answer/10311615#) in the Google Cloud Platform Console Help. 

1.  In the OAuth consent screen, add the following scopes: 
   +  [ The Google Sheets API read-only scope ](https://www.googleapis.com/auth/spreadsheets.readonly) 
   +  [ The Google Drive API read-only scope ](https://www.googleapis.com/auth/drive.readonly) 

    For more information about these scopes, see [ OAuth 2.0 Scopes for Google APIs ](https://developers.google.com/identity/protocols/oauth2/scopes) in the Google Identity documentation. 

1.  Generate OAuth 2.0 client ID and secret. For the steps to create this client ID, see [ Setting up OAuth 2.0 ](https://support.google.com/cloud/answer/6158849?hl=en#zippy=) in the Google Cloud Platform Console Help. 

    The OAuth 2.0 client ID must have one or more authorized redirect URLs. 

    Redirect URLs have the following format: 
   + https://<aws-region>.console.aws.amazon.com/gluestudio/oauth 

1.  Note the client ID and client secret from the settings for your OAuth 2.0 client ID. 

# Limitations for Google Sheets connector


The following are limitations for the Google Sheets connector:
+  The Google Sheets connector does not support filters, so filter-based partitioning is not supported. 
+  In record-based partitioning, the SaaS API does not return an exact record count, so there can be scenarios where files with empty records are created.
+  Because the Google Sheets connector does not support filter-based partitioning, `partitionField`, `lowerbound`, and `upperbound` are not valid connection options. If these options are provided, the AWS Glue job fails. 
+  It is essential to designate the first row of the sheet as the header row to avoid data processing issues. 
  +  If a header row is not provided and the first row of the sheet is empty, the headers are replaced with `Unnamed:1`, `Unnamed:2`, `Unnamed:3`, and so on. 
  +  If a header row is provided, empty column names are replaced with `Unnamed:<column number>`. For example, if the header row is `['ColumnName1', 'ColumnName2', '', '', 'ColumnName5', 'ColumnName6']`, it becomes `['ColumnName1', 'ColumnName2', 'Unnamed:3', 'Unnamed:4', 'ColumnName5', 'ColumnName6']`. 
+  The Google Sheets connector does not support Incremental transfer. 
+  The Google Sheets connector supports only the String data type. 
+  Duplicate headers in a sheet will be iteratively renamed with a numeric suffix. Header names provided by the user will have precedence while renaming the duplicate headers. For example, if the header row is ["Name", "", "Name", null, "Unnamed:6", ""], it will change to: ["Name", "Unnamed:2", "Name1", "Unnamed:4", "Unnamed:6", "Unnamed:61"]. 
+  The Google Sheets connector does not support spaces in a tab name. 
+  A folder name can't have the following special characters: 
  + \$1
  + /
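The empty-header and duplicate-header rules above can be modeled in plain Python. This is an illustrative sketch of the documented behavior, not the connector's actual implementation.

```python
def normalize_headers(headers):
    """Model the documented renaming: empty names become Unnamed:<column>
    (1-indexed), and duplicates get an incrementing numeric suffix."""
    filled = [
        name if name else f"Unnamed:{i + 1}"
        for i, name in enumerate(headers)
    ]
    seen, result = set(), []
    for name in filled:
        candidate, suffix = name, 0
        while candidate in seen:
            suffix += 1
            candidate = f"{name}{suffix}"
        seen.add(candidate)
        result.append(candidate)
    return result


# Reproduces the example from the list above:
print(normalize_headers(["Name", "", "Name", None, "Unnamed:6", ""]))
# → ['Name', 'Unnamed:2', 'Name1', 'Unnamed:4', 'Unnamed:6', 'Unnamed:61']
```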

# Connecting to HubSpot

HubSpot's CRM platform has all the tools and integrations you need for marketing, sales, content management, and customer service.
+ Marketing Hub - Marketing software to help you grow traffic, convert more visitors, and run complete inbound marketing campaigns at scale.
+ Sales Hub - Sales CRM software to help you get deeper insights into prospects, automate the tasks you have, and close more deals faster.
+ Service Hub - Customer service software to help you connect with customers, exceed expectations, and turn them into promoters who grow your business.
+ Operations Hub - Operations software that syncs your apps, cleans and creates customer data, and automates processes — so all your systems and teams work better together.

**Topics**
+ [

# AWS Glue support for HubSpot
](hubspot-support.md)
+ [

# Policies containing the API operations for creating and using connections
](hubspot-configuring-iam-permissions.md)
+ [

# Configuring HubSpot
](hubspot-configuring.md)
+ [

# Configuring HubSpot connections
](hubspot-configuring-connections.md)
+ [

# Reading from HubSpot entities
](hubspot-reading-from-entities.md)
+ [

# Writing to HubSpot Entities
](hubspot-writing-to-entities.md)
+ [

# HubSpot connection options
](hubspot-connection-options.md)
+ [

# Limitations and notes for HubSpot connector
](hubspot-connector-limitations.md)

# AWS Glue support for HubSpot


AWS Glue supports HubSpot as follows:

**Supported as a source?**  
Yes – Sync and Async. You can use AWS Glue ETL jobs to query data from HubSpot.

**Supported as a target?**  
Yes. You can use AWS Glue ETL jobs to write data to HubSpot.

**Supported HubSpot API versions**  
The following HubSpot API versions are supported:
+ v1
+ v2
+ v3
+ v4

For entity support per version, see [Supported entities for Sync source](hubspot-reading-from-entities.md#sync-table) and [Supported entities for Async source](hubspot-reading-from-entities.md#async-table).

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring HubSpot


Before you can use AWS Glue to transfer data from HubSpot, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a HubSpot account. For more information, see [Creating a HubSpot account](#hubspot-configuring-creating-hubspot-account).
+ Your HubSpot account is enabled for API access.
+ You should have an app under your HubSpot developer account that provides the client credentials that AWS Glue uses to access your data securely when it makes authenticated calls to your account. For more information, see [Creating a HubSpot developer app](#hubspot-configuring-creating-hubspot-developer-app).

If you meet these requirements, you’re ready to connect AWS Glue to your HubSpot account. For typical connections, you don't have to do anything else in HubSpot.

## Creating a HubSpot account


To create a HubSpot account:

1. Go to the [HubSpot CRM SignUp URL](https://app.hubspot.com/login).

1. Enter your email address and choose **Verify email** (alternatively, you can choose to sign up with a Google, Microsoft or Apple account).

1. Check your inbox for the verification code from HubSpot.

1. Enter the 6-digit verification code and click **Next**.

1. Enter a password and click **Next**.

1. Enter your first name and last name and click **Next**, or sign up using the **Sign up with Google** link.

1. Enter your industry and click **Next**.

1. Enter your job role and click **Next**.

1. Enter your company name and click **Next**.

1. Select the size of your company (number of employees working in your company) and click **Next**.

1. Enter your company website and click **Next**.

1. Select where your data should be hosted (United States or Europe) and click **Create Account**.

1. Select the purpose of your account creation and click **Next**.

1. Choose **Connect Google Account** or choose to add contacts yourself to link your contacts with your HubSpot account.

1. Log in to your Google account if you chose the **Connect Google Account** option to link your contacts and start using your HubSpot account.

## Creating a HubSpot developer app


App developer accounts are intended for creating and managing apps, integrations, and developer test accounts. They're also where you can create and manage App Marketplace listings. However, app developer accounts and their associated test accounts aren’t connected to a standard HubSpot account. They can't sync data or assets to or from another HubSpot account. To get the Client ID and Client Secret, create a developer account.

1. Go to https://developers.hubspot.com/

1. Choose **Create developer account** and scroll down.

1. You will be asked whether you want to create an App developers account, Private App account, or CMS Developer Sandbox account. Choose **Create App developers account**.

1. Since you already created an Account with HubSpot, you can choose **Continue with this user**.

1. Click **Start signup**.

1. Enter your Job Role and click **Next**.

1. Name your developer account and click **Next**, then **Skip**.

1. Choose **Create App**.

1. Once your App is created, choose **Auth**.

1. Under Auth, note the Client ID and Client Secret.

1. Add your region-specific **Redirect URL** as `https://<aws-region>.console.aws.amazon.com/gluestudio/oauth`. For example, add https://us-east-1.console.aws.amazon.com/gluestudio/oauth for the us-east-1 region.

1. Scroll down to find the scopes. You must select scopes under two headings: "CRM" and "Standard".

1. Add the following scopes:

   ```
   content
   automation
   oauth
   crm.objects.owners.read
   forms
   tickets
   crm.objects.contacts.write
   e-commerce
   crm.schemas.custom.read
   crm.objects.custom.read
   sales-email-read
   crm.objects.custom.write
   crm.objects.companies.write
   crm.lists.write
   crm.objects.companies.read
   crm.lists.read
   crm.objects.deals.read
   crm.objects.deals.write
   crm.objects.contacts.read
   ```

1. Click **Save**. Your developer account is now ready to use.

1. Scroll up to find the **Client ID**.

1. On the same page, click **Show** to get the **Client secret**.

## Creating a HubSpot developer test account


Within app developer accounts, you can create developer test accounts to test apps and integrations without affecting any real HubSpot data. Developer test accounts do not mirror production accounts, but rather have access to a 90-day trial of the Enterprise versions of Marketing, Sales, Service, CMS, and Operations Hub, providing the ability to test most HubSpot tools and APIs.

1. Click **Home**.

1. Click **Create test account**.

1. Click **Create App Test Account**.

1. A new window appears. Enter the app test account name and click **Create**.

   Your app test account is now created.

**Note**  
The developer account is used for development activities such as API integration, and the app test account is used to view the data that is created or pulled by the developer account.

# Configuring HubSpot connections


HubSpot supports the AUTHORIZATION_CODE grant type for OAuth2.
+ This grant type is considered "three-legged" OAuth because it relies on redirecting users to a third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue console. The user creating a connection needs to provide OAuth-related information, such as the Client ID and Client Secret, for their HubSpot client application. The AWS Glue console will redirect the user to HubSpot, where the user must log in and grant AWS Glue the requested permissions to access their HubSpot instance.
+ Users may still opt to create their own connected app in HubSpot and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to HubSpot to log in and authorize AWS Glue to access their resources.
+ This grant type results in a refresh token and an access token. The access token is short-lived and may be refreshed automatically, without user interaction, using the refresh token.
+ For public HubSpot documentation on creating a connected app for Authorization Code OAuth flow, see [Public apps](https://developers.hubspot.com/docs/api/creating-an-app).

To configure a HubSpot connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For the customer managed connected app, the Secret should contain the connected app Consumer Secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as key.

   1. Note: You must create a secret for the connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select HubSpot.

   1. Provide the HubSpot environment.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

1. In your AWS Glue job configuration, provide `connectionName` as an **Additional network connection**.
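The secret from step 1 can also be prepared programmatically. A minimal sketch that builds the required secret value; the secret name `glue-hubspot-secret` in the comment and the placeholder value are illustrative:

```python
import json

# The secret value must be a JSON object whose key is exactly
# USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET.
secret_string = json.dumps(
    {"USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET": "<your-hubspot-client-secret>"}
)

# Create the secret with the AWS CLI (or boto3's create_secret), for example:
#   aws secretsmanager create-secret --name glue-hubspot-secret \
#       --secret-string "$SECRET_STRING"
print(secret_string)
```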

# Reading from HubSpot entities


**Prerequisite**

A HubSpot object that you would like to read from. You will need the object name, such as `contact` or `task`. The following table shows the supported entities for Sync source.

## Supported entities for Sync source



| Entity | API version | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | --- | 
| Campaigns | v1 | No | Yes | No | Yes | No | 
| Companies | v3 | Yes | Yes | Yes | Yes | Yes | 
| Contacts | v3 | Yes | Yes | Yes | Yes | Yes | 
| Contact Lists | v1 | No | Yes | No | Yes | No | 
| Deals | v3 | Yes | Yes | Yes | Yes | Yes | 
| CRM Pipeline (Deal Pipelines) | v1 | No | No | No | Yes | No | 
| Email Events | v1 | No | Yes | No | Yes | No | 
| Calls | v3 | Yes | Yes | Yes | Yes | Yes | 
| Notes | v3 | Yes | Yes | Yes | Yes | Yes | 
| Emails | v3 | Yes | Yes | Yes | Yes | Yes | 
| Meetings | v3 | Yes | Yes | Yes | Yes | Yes | 
| Tasks | v3 | Yes | Yes | Yes | Yes | Yes | 
| Postal Mails | v3 | Yes | Yes | Yes | Yes | Yes | 
| Custom Objects | v3 | Yes | Yes | Yes | Yes | Yes | 
| Forms | v2 | No | No | No | Yes | No | 
| Owners | v3 | No | Yes | No | Yes | No | 
| Products | v3 | Yes | Yes | Yes | Yes | Yes | 
| Tickets | v3 | Yes | Yes | Yes | Yes | Yes | 
| Workflows | v3 | No | No | No | Yes | No | 
| Associations | v4 | Yes | No | No | Yes | No | 
| Associations Labels | v4 | No | No | No | Yes | No | 

**Example**:

```
hubspot_read = glueContext.create_dynamic_frame.from_options(
    connection_type="hubspot",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "contact",
        "API_VERSION": "v3"
    }
)
```

## Supported entities for Async source



| Entity | API version | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | --- | 
| Companies | v3 | Yes | No | Yes | Yes | No | 
| Contacts | v3 | Yes | No | Yes | Yes | No | 
| Deals | v3 | Yes | No | Yes | Yes | No | 
| Calls | v3 | Yes | No | Yes | Yes | No | 
| Notes | v3 | Yes | No | Yes | Yes | No | 
| Emails | v3 | Yes | No | Yes | Yes | No | 
| Meetings | v3 | Yes | No | Yes | Yes | No | 
| Tasks | v3 | Yes | No | Yes | Yes | No | 
| Postal Mails | v3 | Yes | No | Yes | Yes | No | 
| Custom Objects | v3 | Yes | No | Yes | Yes | No | 
| Products | v3 | Yes | No | Yes | Yes | No | 
| Tickets | v3 | Yes | No | Yes | Yes | No | 

**Example**:

```
hubspot_read = glueContext.create_dynamic_frame.from_options(
    connection_type="hubspot",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "contact",
        "API_VERSION": "v3",
        "TRANSFER_MODE": "ASYNC"
    }
)
```

**HubSpot entity and field details**:

**HubSpot API v4**: 

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/hubspot-reading-from-entities.html)

**Note**  
For the `Associations` object, to fetch associations between two objects, you must provide the 'from Id' (the ID of the first object) via a mandatory filter when creating an AWS Glue job. To fetch associations for multiple 'from Ids', provide multiple IDs in the `where` clause. For example, to fetch `Associations` for contact IDs '1' and '151', provide a filter such as `where id=1 AND id=151`.
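As a sketch, the mandatory filter from the note above can be supplied through the `FILTER_PREDICATE` connection option documented later on this page; the connection name and IDs are illustrative:

```python
# Connection options for reading HubSpot Associations with the mandatory
# 'from Id' filter; pass this dict as connection_options to
# glueContext.create_dynamic_frame.from_options(connection_type="hubspot", ...).
connection_options = {
    "connectionName": "connectionName",
    "ENTITY_NAME": "Associations",
    "API_VERSION": "v4",
    # Fetch Associations for contact IDs '1' and '151', per the note above.
    "FILTER_PREDICATE": "id=1 AND id=151",
}
```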

**HubSpot API v3**:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/hubspot-reading-from-entities.html)

For the following entities, HubSpot provides endpoints to fetch metadata dynamically, so that operator support is captured at the datatype level for each entity.

**Note**  
`DML_STATUS` is a virtual field added on every record at runtime to determine its status (CREATED/UPDATED) in the Sync mode. The `CONTAINS/LIKE` operator is not supported in the Async mode.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/hubspot-reading-from-entities.html)

**HubSpot API v2**:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/hubspot-reading-from-entities.html)

**HubSpot API v1**:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/hubspot-reading-from-entities.html)

## Partitioning queries


You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query would be split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For DateTime fields, the value must be in ISO format.

  Example of a valid value:

  ```
  "2024-01-01T10:00:00.115Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.
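Conceptually, the range [`LOWER_BOUND`, `UPPER_BOUND`) is split into `NUM_PARTITIONS` sub-ranges, one per concurrent Spark task. The following is only an illustration of that splitting for a numeric partition field; the connector's actual behavior may differ:

```python
# Illustration only: evenly split [lower, upper) into n sub-ranges,
# one per concurrent Spark task.
def split_range(lower: float, upper: float, n: int):
    step = (upper - lower) / n
    return [(lower + i * step, lower + (i + 1) * step) for i in range(n)]

# Using the numeric bounds from the example later on this page:
# hs_object_id from 50 (inclusive) to 16726619290 (exclusive), 10 partitions.
partitions = split_range(50, 16726619290, 10)
```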

The following table describes the entity partitioning field support details:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/hubspot-reading-from-entities.html)

Example:

```
hubspot_read = glueContext.create_dynamic_frame.from_options(
    connection_type="hubspot",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "company",
        "API_VERSION": "v3",
        "PARTITION_FIELD": "hs_object_id",
        "LOWER_BOUND": "50",
        "UPPER_BOUND": "16726619290",
        "NUM_PARTITIONS": "10"
    }
)
```

# Writing to HubSpot Entities


## Prerequisites

+ A HubSpot object you would like to write to. You will need the object name such as contact or ticket.
+ The HubSpot connector supports following write operations:
  + INSERT
  + UPDATE
+ When using the `UPDATE` write operation, the `ID_FIELD_NAMES` option must be provided to specify the ID field for the records.

## Supported entities for Sync Destination



| Entity | API Version | Will be supported as Destination Connector | Can be Inserted | Can be Updated | 
| --- | --- | --- | --- | --- | 
| Companies | v3 | Yes | Yes (Single, Bulk) | Yes (Single, Bulk) | 
| Contacts | v3 | Yes | Yes (Single, Bulk) | Yes (Single, Bulk) | 
| Deals | v3 | Yes | Yes (Single, Bulk) | Yes (Single, Bulk) | 
| Products | v3 | Yes | Yes (Single, Bulk) | Yes (Single, Bulk) | 
| Calls | v3 | Yes | Yes (Single, Bulk) | Yes (Single, Bulk) | 
| Meetings | v3 | Yes | Yes (Single, Bulk) | Yes (Single, Bulk) | 
| Notes | v3 | Yes | Yes (Single, Bulk) | Yes (Single, Bulk) | 
| Emails | v3 | Yes | Yes (Single, Bulk) | Yes (Single, Bulk) | 
| Tasks | v3 | Yes | Yes (Single, Bulk) | Yes (Single, Bulk) | 
| Postal Mails | v3 | Yes | Yes (Single, Bulk) | Yes (Single, Bulk) | 
| Custom Objects | v3 | Yes | Yes (Single, Bulk) | Yes (Single, Bulk) | 
| Tickets | v3 | Yes | Yes (Single, Bulk) | Yes (Single, Bulk) | 
| Associations | v4 | Yes | Yes (Single, Bulk) | No | 
| Associations Labels | v4 | Yes | Yes (Single, Bulk) | Yes (Single, Bulk) | 

**Examples**:

**INSERT Operation**

```
hubspot_write = glueContext.write_dynamic_frame.from_options(
    frame=frameToWrite,
    connection_type="hubspot",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "contact",
        "API_VERSION": "v3",
        "WRITE_OPERATION": "INSERT"
    }
)
```

**UPDATE Operation**

```
hubspot_write = glueContext.write_dynamic_frame.from_options(
    frame=frameToWrite,
    connection_type="hubspot",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "deal",
        "API_VERSION": "v3",
        "WRITE_OPERATION": "UPDATE",
        "ID_FIELD_NAMES": "hs_object_id"
    }
)
```

# HubSpot connection options


The following are connection options for HubSpot:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in HubSpot.
+ `API_VERSION`(String) - (Required) Used for Read. The HubSpot REST API version you want to use. For example: v1, v2, v3, v4.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition query.
+ `LOWER_BOUND`(String)- Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field. 
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read.
+ `TRANSFER_MODE`(String) - Used to indicate whether the query should be run in Async mode.
+ `WRITE_OPERATION`(String) - Default: INSERT. Used for write. Value should be INSERT or UPDATE.
+ `ID_FIELD_NAMES`(String) - Default : null. Required for UPDATE.
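As a sketch, several of these options can be combined in a single read; the field names and predicate below are illustrative, not part of the HubSpot schema guaranteed by this page:

```python
# Illustrative connection_options combining several of the options above;
# pass to glueContext.create_dynamic_frame.from_options(connection_type="hubspot", ...).
connection_options = {
    "connectionName": "connectionName",
    "ENTITY_NAME": "contact",
    "API_VERSION": "v3",
    # Select a subset of columns instead of the default SELECT *.
    "SELECTED_FIELDS": ["firstname", "lastname", "email"],
    # Row filter in Spark SQL format (the field name is an assumption).
    "FILTER_PREDICATE": "lastmodifieddate > '2024-01-01'",
}
```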

# Limitations and notes for HubSpot connector


The following are limitations or notes for the HubSpot connector:
+ The search endpoints are limited to 10,000 total results for any given query. Any partition having more than 10,000 records will result in a 400 error.
+ Other important limitations for the connector are described in [Limitations](https://developers.hubspot.com/docs/api/crm/search#limitations).
+ A maximum of three filtering statements are accepted by HubSpot.
+ Currently, HubSpot supports Associations between standard HubSpot objects (e.g. contact, company, deal, or ticket) and custom objects.
  + For a Free account: you can create only up to 10 association types between each object pairing (e.g. contacts and companies).
  + For a Super Admin account: you can create only up to 50 association types between each object pairing.
  + For more information, see [Associations v4](https://developers.hubspot.com/docs/api/crm/) and [Create and use association labels](https://knowledge.hubspot.com/object-settings/create-and-use-association-labels).
+ The 'Quote' and 'Communications' objects are not present for Associations as they are currently not supported in the connector.
+ For Async, SaaS sorts the values in ascending order only.
+ For the `Ticket` entity, SaaS doesn't return the `hs_object_id` field in the Async mode.

# Connecting to Instagram Ads

Instagram is a popular photo-sharing and social networking service that lets you connect with brands, celebrities, thought leaders, friends, family, and more. Users can take photos or short videos and share them with their followers. Instagram ads are posts that businesses can pay to serve to Instagram users.

**Topics**
+ [

# AWS Glue support for Instagram Ads
](instagram-ads-support.md)
+ [

# Policies containing the API operations for creating and using connections
](instagram-ads-configuring-iam-permissions.md)
+ [

# Configuring Instagram Ads
](instagram-ads-configuring.md)
+ [

# Configuring Instagram Ads connections
](instagram-ads-configuring-connections.md)
+ [

# Reading from Instagram Ads entities
](instagram-ads-reading-from-entities.md)
+ [

# Instagram Ads connection options
](instagram-ads-connection-options.md)
+ [

# Limitations and notes for Instagram Ads connector
](instagram-ads-connector-limitations.md)

# AWS Glue support for Instagram Ads


AWS Glue supports Instagram Ads as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Instagram Ads.

**Supported as a target?**  
No.

**Supported Instagram Ads API versions**  
The following Instagram Ads API versions are supported:
+ v17.0
+ v18.0
+ v19.0
+ v20.0

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]


```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Instagram Ads


Before you can use AWS Glue to transfer data from Instagram Ads, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ Instagram Standard accounts are accessed indirectly through Facebook.
+ User authentication is needed to generate the access token.
+ The Instagram Ads SDK connector implements the *User Access Token OAuth* flow.
+ OAuth 2.0 is used to authenticate API requests to Instagram Ads. This web-based authentication falls under the Multi-Factor Authentication (MFA) architecture, which is a superset of 2FA.
+ The user needs to grant permissions to access the end points. For accessing the user's data, endpoint authorization is handled through [permissions](https://developers.facebook.com/docs/permissions) and [features](https://developers.facebook.com/docs/features-reference).

## Getting OAuth 2.0 credentials


To obtain API credentials so that you can make authenticated calls to your instance, see [Graph API](https://developers.facebook.com/docs/graph-api/).

# Configuring Instagram Ads connections


Instagram Ads supports the AUTHORIZATION_CODE grant type for OAuth2.
+ This grant type is considered three-legged OAuth as it relies on redirecting users to the third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue console.
+ Users may opt to create their own connected app in Instagram Ads and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to Instagram Ads to log in and authorize AWS Glue to access their resources.
+ This grant type results in an access token. An expiring system user token is valid for 60 days from the date it is generated or refreshed. To maintain continuity, the developer should refresh the access token within 60 days. Failing to do so forfeits the access token and requires the developer to obtain a new one to regain API access. See [Refresh Access Token](https://developers.facebook.com/docs/marketing-api/system-users/install-apps-and-generate-tokens/).

To configure an Instagram Ads connection:

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select Instagram Ads.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]


      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Provide the User Managed Client Application Client ID.

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection. The selected secret must have a key `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` whose value is the Client Secret from the connected app.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

# Reading from Instagram Ads entities


**Prerequisite**

An Instagram Ads object that you would like to read from. You will need the object name. The following table shows the supported entities.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Campaign | Yes | Yes | No | Yes | Yes | 
| Ad Set | Yes | Yes | No | Yes | Yes | 
| Ads | Yes | Yes | No | Yes | Yes | 
| Ad Creative | No | Yes | No | Yes | No | 
| Insights - Account | No | Yes | No | Yes | No | 
| Ad Image | Yes | Yes | No | Yes | No | 
| Insights - Ad | Yes | Yes | No | Yes | Yes | 
| Insights - AdSet | Yes | Yes | No | Yes | Yes | 
| Insights - Campaign | Yes | Yes | No | Yes | Yes | 

**Example**:

```
instagramAds_read = glueContext.create_dynamic_frame.from_options(
    connection_type="instagramads",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v20.0"
    }
)
```

## Instagram Ads entity and field details


For more information about the entities and field details see:
+ [Campaign](https://developers.facebook.com/docs/marketing-api/reference/ad-campaign-group)
+ [Ad Set](https://developers.facebook.com/docs/marketing-api/reference/ad-campaign)
+ [Ad](https://developers.facebook.com/docs/marketing-api/reference/adgroup)
+ [Ad Creative](https://developers.facebook.com/docs/marketing-api/reference/ad-creative)
+ [Ad Account Insight](https://developers.facebook.com/docs/marketing-api/reference/ad-account/insights)
+ [Ad Image](https://developers.facebook.com/docs/marketing-api/reference/ad-image)
+ [Ad Insights](https://developers.facebook.com/docs/marketing-api/reference/adgroup/insights/)
+ [AdSets Insights](https://developers.facebook.com/docs/marketing-api/reference/ad-campaign/insights)
+ [Campaigns Insights](https://developers.facebook.com/docs/marketing-api/reference/ad-campaign-group/insights)

For more information, see [Marketing API](https://developers.facebook.com/docs/marketing-api/reference/v21.0).

**Note**  
Struct and List data types are converted to String data type in the response of the connectors.

## Partitioning queries


You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query would be split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For the DateTime field, we accept the Spark timestamp format used in Spark SQL queries.

  Example of valid value:

  ```
  "2022-01-01T00:00:00.000Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.

  Example of valid value:

  ```
  "2024-01-02T00:00:00.000Z"
  ```
+ `NUM_PARTITIONS`: the number of partitions.

Example:

```
instagramADs_read = glueContext.create_dynamic_frame.from_options(
    connection_type="instagramads",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v20.0",
        "PARTITION_FIELD": "created_time",
        "LOWER_BOUND": "2022-01-01T00:00:00.000Z",
        "UPPER_BOUND": "2024-01-02T00:00:00.000Z",
        "NUM_PARTITIONS": "10"
    }
)
```

# Instagram Ads connection options


The following are connection options for Instagram Ads:
+ `ENTITY_NAME`(String) - (Required) Used for read. The name of your object in Instagram Ads.
+ `API_VERSION`(String) - (Required) Used for read. Instagram Ads Graph API version you want to use. For example: v21.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for read. Field to be used to partition query.
+ `LOWER_BOUND`(String)- Used for read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for read. An exclusive upper bound value of the chosen partition field. 
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for read. Number of partitions for read.

# Limitations and notes for Instagram Ads connector


The following are limitations or notes for the Instagram Ads connector:
+ An app's call count is the number of calls a user can make during a rolling one-hour window: 200 multiplied by the number of users. For rate limit details, see [Rate Limits](https://developers.facebook.com/docs/graph-api/overview/rate-limiting/) and [Business Use Case Rate Limits](https://developers.facebook.com/docs/graph-api/overview/rate-limiting/#buc-rate-limits).
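The rate-limit rule above amounts to a simple budget; the function and user count below are hypothetical, for illustration only:

```python
# Per the rule above: calls allowed in a rolling one-hour window is roughly
# 200 multiplied by the number of app users.
def hourly_call_budget(num_users: int) -> int:
    return 200 * num_users

budget = hourly_call_budget(25)  # 25 users -> 5000 calls per rolling hour
```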

# Connecting to Intercom in AWS Glue Studio

 Intercom is the Engagement OS, an open channel between your business and your customers—in product, in the moment, and on their terms—creating an ongoing dialogue that enables you to make the most of every engagement across the customer journey. 

**Topics**
+ [

# AWS Glue support for Intercom
](intercom-support.md)
+ [

# Policies containing the API operations for creating and using connections
](intercom-configuring-iam-permissions.md)
+ [

# Configuring Intercom
](intercom-configuring.md)
+ [

# Configuring Intercom connections
](intercom-configuring-connections.md)
+ [

# Reading from Intercom entities
](intercom-reading-from-entities.md)
+ [

# Intercom connection options
](intercom-connection-options.md)
+ [

# Limitations
](intercom-limitations.md)
+ [

# Creating a new Intercom account and configuring the client app
](intercom-new-account-creation.md)

# AWS Glue support for Intercom


AWS Glue supports Intercom as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Intercom.

**Supported as a target?**  
No.

**Supported Intercom API versions**  
 v2.5. For entity support by version, see [Reading from Intercom entities](intercom-reading-from-entities.md). 

# Policies containing the API operations for creating and using connections

 The following sample policy describes the required IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

You can also use the following managed IAM policies to allow access:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, Amazon CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring Intercom


Before you can use AWS Glue to transfer data from Intercom, you must meet these requirements:

## Minimum requirements

+  You have an Intercom account. For more information, see [Creating a new Intercom account and configuring the client app](intercom-new-account-creation.md). 
+  Your Intercom account is enabled for API access. 
+  You have an app under your Intercom developer account. The app provides the client credentials that AWS Glue uses to access your data securely when it makes authenticated calls to your account. For more information, see [Creating a new Intercom account and configuring the client app](intercom-new-account-creation.md). 

 If you meet these requirements, you’re ready to connect AWS Glue to your Intercom account. 

# Configuring Intercom connections


 Intercom supports the `AUTHORIZATION_CODE` grant type for OAuth 2. 

 This grant type is considered "three-legged" OAuth because it relies on redirecting users to the third-party authorization server to authenticate the user. It is used when creating connections through the AWS Glue console. The AWS Glue console redirects the user to Intercom, where the user must log in and allow AWS Glue the requested permissions to access their Intercom instance. 

 Users should provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they are still redirected to Intercom to log in and authorize AWS Glue to access their resources. 

 This grant type results in a refresh token and an access token. The access token is short-lived and can be refreshed automatically, without user interaction, by using the refresh token. 

 For more information on creating a connected app for the Authorization Code OAuth flow, see [Setting up OAuth](https://developers.intercom.com/building-apps/docs/setting-up-oauth) in the Intercom documentation. 

To configure an Intercom connection:

1.  In AWS Secrets Manager, create a secret with the following details. You must create a secret for each connection in AWS Glue. 

   1.  For a customer managed connected app, the secret should contain the connected app access token, refresh token, client\_id, and client\_secret. 

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps: 

   1. When selecting a **Connection type**, select Intercom.

   1. Provide the Intercom environment.

   1.  Select the IAM role that AWS Glue can assume and that has permissions for the following actions: 

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1.  Select the `secretName` that you want AWS Glue to use to store the tokens for this connection. 

   1.  Select network options if you want to use your own network. 

1.  Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 
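For reference, the secret created in step 1 is a set of key-value pairs. The key names below are illustrative placeholders for the access token, refresh token, client ID, and client secret; they are not documented Intercom key names, so match them to what your connection setup expects:

```json
{
  "ACCESS_TOKEN": "example-access-token",
  "REFRESH_TOKEN": "example-refresh-token",
  "CLIENT_ID": "example-client-id",
  "CLIENT_SECRET": "example-client-secret"
}
```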

# Reading from Intercom entities


 **Prerequisites** 
+  An Intercom object that you want to read from. See the supported entities table below for the available entities. 

 **Supported entities** 


| Entity | API Version | Can be Filtered | Supports Limit | Supports Order By | Supports Select \* | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | --- | 
| Admins | v2.5 | No | No | No | Yes | No | 
| Companies | v2.5 | No | Yes | No | Yes | No | 
| Conversations | v2.5 | Yes | Yes | Yes | Yes | Yes | 
| Data Attributes | v2.5 | No | No | No | Yes | No | 
| Contacts | v2.5 | Yes | Yes | Yes | Yes | Yes | 
| Segments | v2.5 | No | No | No | Yes | No | 
| Tags | v2.5 | No | No | No | Yes | No | 
| Teams | v2.5 | No | No | No | Yes | No | 

 **Example** 

```
Intercom_read = glueContext.create_dynamic_frame.from_options(
    connection_type="Intercom",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "company",
        "API_VERSION": "v2.5"
    }
)
```

 **Intercom entity and field details** 


| Entity | Field | Data Type | Supported Operators | 
| --- | --- | --- | --- | 
| Admins | type | String | NA | 
| Admins | id | String | NA | 
| Admins | avatar | Struct | NA | 
| Admins | name | String | NA | 
| Admins | email | String | NA | 
| Admins | away\_mode\_enabled | Boolean | NA | 
| Admins | away\_mode\_reassign | Boolean | NA | 
| Admins | has\_inbox\_seat | Boolean | NA | 
| Admins | teams\_ids | List | NA | 
| Admins | job\_title | String | NA | 
| Companies | type | String | NA | 
| Companies | id | String | NA | 
| Companies | app\_id | String | NA | 
| Companies | created\_at | DateTime | NA | 
| Companies | remote\_created\_at | DateTime | NA | 
| Companies | updated\_at | DateTime | NA | 
| Companies | last\_request\_at | DateTime | NA | 
| Companies | plan | Struct | NA | 
| Companies | company\_id | String | NA | 
| Companies | name | String | NA | 
| Companies | custom\_attributes | Struct | NA | 
| Companies | session\_count | Integer | NA | 
| Companies | monthly\_spend | Integer | NA | 
| Companies | user\_count | Integer | NA | 
| Companies | industry | String | NA | 
| Companies | size | Integer | NA | 
| Companies | website | String | NA | 
| Companies | tags | Struct | NA | 
| Companies | segments | Struct | NA | 
| Contacts | id | String | EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | type | String | NA | 
| Contacts | workspace\_id | String | NA | 
| Contacts | external\_id | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | role | String | EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | email | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | phone | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | name | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | avatar | String | NA | 
| Contacts | owner\_id | Integer | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Contacts | social\_profiles | Struct | NA | 
| Contacts | has\_hard\_bounced | Boolean | EQUAL\_TO | 
| Contacts | marked\_email\_as\_spam | Boolean | EQUAL\_TO | 
| Contacts | unsubscribed\_from\_emails | Boolean | EQUAL\_TO | 
| Contacts | created\_at | DateTime | EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Contacts | updated\_at | DateTime | EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Contacts | signed\_up\_at | DateTime | EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Contacts | last\_seen\_at | DateTime | EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Contacts | last\_replied\_at | DateTime | EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Contacts | last\_contacted\_at | DateTime | EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Contacts | last\_email\_opened\_at | DateTime | EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Contacts | last\_email\_clicked\_at | DateTime | EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Contacts | language\_override | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | browser | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | browser\_version | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | browser\_language | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | os | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | location | Struct | NA | 
| Contacts | location\_country | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | location\_region | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | location\_city | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | android\_app\_name | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | android\_app\_version | String | NA | 
| Contacts | android\_device | String | NA | 
| Contacts | android\_os\_version | String | NA | 
| Contacts | android\_sdk\_version | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | android\_last\_seen\_at | Date | NA | 
| Contacts | ios\_app\_name | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | ios\_app\_version | String | NA | 
| Contacts | ios\_device | String | NA | 
| Contacts | ios\_os\_version | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | ios\_sdk\_version | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Contacts | ios\_last\_seen\_at | DateTime | NA | 
| Contacts | custom\_attributes | Struct | NA | 
| Contacts | tags | Struct | NA | 
| Contacts | notes | Struct | NA | 
| Contacts | companies | Struct | NA | 
| Contacts | unsubscribed\_from\_sms | Boolean | NA | 
| Contacts | sms\_consent | Boolean | NA | 
| Contacts | opted\_out\_subscription\_types | Struct | NA | 
| Contacts | referrer | String | NA | 
| Contacts | utm\_campaign | String | NA | 
| Contacts | utm\_content | String | NA | 
| Contacts | utm\_medium | String | NA | 
| Contacts | utm\_source | String | NA | 
| Contacts | utm\_term | String | NA | 
| Conversations | type | String | NA | 
| Conversations | id | Integer | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | created\_at | DateTime | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | updated\_at | DateTime | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | source | Struct | NA | 
| Conversations | source\_id | String | EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | source\_type | String | EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | source\_delivered\_as | String | EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | source\_subject | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | source\_body | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | source\_author\_id | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | source\_author\_type | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | source\_author\_name | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | source\_author\_email | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | source\_url | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | contacts | Struct | NA | 
| Conversations | teammates | Struct | NA | 
| Conversations | title | String | NA | 
| Conversations | admin\_assignee\_id | Integer | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | team\_assignee\_id | Integer | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | custom\_attributes | Struct | NA | 
| Conversations | open | Boolean | EQUAL\_TO | 
| Conversations | state | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | read | Boolean | EQUAL\_TO | 
| Conversations | waiting\_since | DateTime | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | snoozed\_until | DateTime | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | tags | Struct | NA | 
| Conversations | first\_contact\_reply | Struct | NA | 
| Conversations | priority | String | EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | topics | Struct | NA | 
| Conversations | sla\_applied | Struct | NA | 
| Conversations | conversation\_rating | Struct | NA | 
| Conversations | conversation\_rating\_requested\_at | DateTime | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | conversation\_rating\_replied\_at | DateTime | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | conversation\_rating\_score | Integer | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | conversation\_rating\_remark | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | conversation\_rating\_contact\_id | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | conversation\_rating\_admin\_id | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | statistics | Struct | NA | 
| Conversations | statistics\_time\_to\_assignment | Integer | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | statistics\_time\_to\_admin\_reply | Integer | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | statistics\_time\_to\_first\_close | Integer | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | statistics\_time\_to\_last\_close | Integer | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | statistics\_median\_time\_to\_reply | Integer | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | statistics\_first\_contact\_reply\_at | DateTime | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | statistics\_first\_assignment\_at | DateTime | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | statistics\_first\_admin\_reply\_at | DateTime | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | statistics\_first\_close\_at | DateTime | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | statistics\_last\_assignment\_at | DateTime | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | statistics\_last\_assignment\_admin\_reply\_at | DateTime | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | statistics\_last\_contact\_reply\_at | DateTime | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | statistics\_last\_admin\_reply\_at | DateTime | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | statistics\_last\_close\_at | DateTime | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | statistics\_last\_closed\_by\_id | String | CONTAINS, EQUAL\_TO, NOT\_EQUAL\_TO | 
| Conversations | statistics\_count\_reopens | Integer | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | statistics\_count\_assignments | Integer | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | statistics\_count\_conversation\_parts | Integer | EQUAL\_TO, NOT\_EQUAL\_TO, GREATER\_THAN, LESS\_THAN | 
| Conversations | conversation\_parts | List | NA | 
| Data Attributes | id | Integer | NA | 
| Data Attributes | type | String | NA | 
| Data Attributes | model | String | NA | 
| Data Attributes | name | String | NA | 
| Data Attributes | full\_name | String | NA | 
| Data Attributes | label | String | NA | 
| Data Attributes | description | String | NA | 
| Data Attributes | data\_type | String | NA | 
| Data Attributes | options | List | NA | 
| Data Attributes | api\_writable | Boolean | NA | 
| Data Attributes | ui\_writable | Boolean | NA | 
| Data Attributes | custom | Boolean | NA | 
| Data Attributes | archived | Boolean | NA | 
| Data Attributes | created\_at | Boolean | NA | 
| Data Attributes | updated\_at | DateTime | NA | 
| Data Attributes | admin\_id | String | NA | 
| Segments | type | String | NA | 
| Segments | id | String | NA | 
| Segments | name | String | NA | 
| Segments | created\_at | DateTime | NA | 
| Segments | updated\_at | DateTime | NA | 
| Segments | person\_type | String | NA | 
| Segments | count | Integer | NA | 
| Tags | type | String | NA | 
| Tags | id | String | NA | 
| Tags | name | String | NA | 
| Teams | type | String | NA | 
| Teams | id | String | NA | 
| Teams | name | String | NA | 
| Teams | admin\_ids | List | NA | 

 **Partitioning queries** 

 The additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` can be provided if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently. 
+  `PARTITION_FIELD`: the name of the field used to partition the query. 
+  `LOWER_BOUND`: an inclusive lower bound value of the chosen partition field. 

   For dates, we accept the Spark date format used in Spark SQL queries. Example of a valid value: `"2024-02-06"`. 
+  `UPPER_BOUND`: an exclusive upper bound value of the chosen partition field. 
+  `NUM_PARTITIONS`: the number of partitions. 
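Conceptually, the connector turns these bounds into `NUM_PARTITIONS` contiguous sub-ranges, one per sub-query. A minimal sketch of that split for an integer partition field follows; the function is ours for illustration, not the connector's actual implementation.

```python
# Sketch of cutting [LOWER_BOUND, UPPER_BOUND) into NUM_PARTITIONS sub-ranges;
# each sub-range backs one sub-query that a Spark task can run concurrently.
def split_bounds(lower, upper, num_partitions):
    """Return (lo, hi) pairs covering [lower, upper) with no gaps or overlap."""
    step = (upper - lower) / num_partitions
    cuts = [lower + round(i * step) for i in range(num_partitions)] + [upper]
    return list(zip(cuts[:-1], cuts[1:]))

# e.g. Conversations partitioned on the integer id field
ranges = split_bounds(0, 100, 4)  # [(0, 25), (25, 50), (50, 75), (75, 100)]
```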

 Entity-wise partitioning field support details are captured in the following table. 


| Entity Name | Partitioning Field | Data Type | 
| --- | --- | --- | 
| Contacts | created\_at, updated\_at, last\_seen\_at | DateTime | 
| Conversations | id | Integer | 
| Conversations | created\_at, updated\_at | DateTime | 

 **Example** 

```
Intercom_read = glueContext.create_dynamic_frame.from_options(
    connection_type="Intercom",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "conversation",
        "API_VERSION": "v2.5",
        "PARTITION_FIELD": "created_at",
        "LOWER_BOUND": "2022-07-13T07:55:27.065Z",
        "UPPER_BOUND": "2022-08-12T07:55:27.065Z",
        "NUM_PARTITIONS": "2"
    }
)
```

# Intercom connection options


The following are connection options for Intercom:
+  `ENTITY_NAME` (String) - (Required) Used for read. The name of your object in Intercom. 
+  `API_VERSION` (String) - (Required) Used for read. The Intercom REST API version that you want to use. For example: v2.5. 
+  `SELECTED_FIELDS` (List<String>) - Default: empty (SELECT \*). Used for read. The columns that you want to select for the object. 
+  `FILTER_PREDICATE` (String) - Default: empty. Used for read. A filter condition in Spark SQL format. 
+  `QUERY` (String) - Default: empty. Used for read. A full Spark SQL query. 
+  `PARTITION_FIELD` (String) - Used for read. The field used to partition the query. 
+  `LOWER_BOUND` (String) - Used for read. An inclusive lower bound value of the chosen partition field. 
+  `UPPER_BOUND` (String) - Used for read. An exclusive upper bound value of the chosen partition field. 
+  `NUM_PARTITIONS` (Integer) - Default: 1. Used for read. The number of partitions for the read. 
+  `INSTANCE_URL` (String) - The URL of the instance where you want to run the operations. For example: [https://api.intercom.io](https://api.intercom.io). 

# Limitations


The following are limitations for the Intercom connector:
+  When using the Companies entity, at most 10,000 companies can be returned. For more information, see [List all companies API](https://developers.intercom.com/docs/references/2.5/rest-api/companies/list-companies). 
+  When applying order by, a filter is mandatory for both the **Contact** and **Conversation** entities. 
+  MCA is supported by the SaaS provider. However, based on the API rate limits mentioned in the documentation, AWS Glue does not host MCA because it may impact other workloads and potentially cause performance issues due to resource contention. 

# Creating a new Intercom account and configuring the client app


**Creating an Intercom account**

1. Navigate to the [Intercom URL](https://app.intercom.com/) and choose **Start my free trial** in the upper-right corner of the page.

1. Choose **Try for free** in the upper-right corner of the page.

1. Choose the business type you require.

1. Enter all the information required on the page.

1. After entering all the information, choose **Register**.



**Creating an Intercom developer app**

To get the **Client Id** and **Client Secret**, you create a developer account.

1. Navigate to [https://app.intercom.com/](https://app.intercom.com/).

1. Enter your email and password, or sign in using Google, and log in.

1. Choose the **user profile** in the lower-left corner and choose **Settings**.

1. Choose **Apps & Integration**.

1. Choose the **Developer Hub** tab under **Apps & Integration**.

1. Choose **New app** and create the app here.

1. Provide the app name and choose **Create app**.

1. Inside the app, navigate to the **Authentication** section.

1. Choose **Edit** and add redirect URIs. Add your Region-specific redirect URL in the form `https://<aws-region>.console.aws.amazon.com/gluestudio/oauth`. For example, add `https://us-east-1.console.aws.amazon.com/gluestudio/oauth` for the us-east-1 Region.

1. Get the generated **Client Id** and **Client Secret** in the **Basic information** section.

# Connecting to Jira Cloud

Jira Cloud is a platform developed by Atlassian. The platform includes issue tracking products that help teams plan and track their Agile projects. As a Jira Cloud user, your account contains data about your projects, such as issues, workflows, and events. You can use AWS Glue to transfer your Jira Cloud data to certain AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for Jira Cloud
](jira-cloud-support.md)
+ [

# Policies containing the API operations for creating and using connections
](jira-cloud-configuring-iam-permissions.md)
+ [

# Configuring Jira Cloud
](jira-cloud-configuring.md)
+ [

# Configuring Jira Cloud connections
](jira-cloud-configuring-connections.md)
+ [

# Reading from Jira Cloud entities
](jira-cloud-reading-from-entities.md)
+ [

# Jira Cloud connection options
](jira-cloud-connection-options.md)
+ [

# Limitations and notes for Jira Cloud connector
](jira-cloud-connector-limitations.md)

# AWS Glue support for Jira Cloud


AWS Glue supports Jira Cloud as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Jira Cloud.

**Supported as a target?**  
No.

**Supported Jira Cloud API versions**  
The following Jira Cloud API versions are supported:
+ v3

For entity support by version, see [Reading from Jira Cloud entities](jira-cloud-reading-from-entities.md). 

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Jira Cloud


Before you can use AWS Glue to transfer data from Jira Cloud to supported destinations, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have an Atlassian account where you use the Jira software product in Jira Cloud. For more information, see [Creating a Jira Cloud account](#jira-cloud-configuring-creating-jira-cloud-account).
+ You must have an AWS account created with the service access to AWS Glue.
+ This app provides the client credentials that AWS Glue uses to access your data securely when it makes authenticated calls to your account. For more information, see [Enabling OAuth 2.0 (3LO)](https://developer.atlassian.com/cloud/jira/platform/oauth-2-3lo-apps/#enabling-oauth-2-0--3lo-) in the Atlassian Developer documentation.

If you meet these requirements, you’re ready to connect AWS Glue to your Jira Cloud account.

## Creating a Jira Cloud account


To create a Jira Cloud account:

1. Navigate to the [Atlassian sign up URL](https://id.atlassian.com/signup).

1. Enter your work email and name, and choose **Agree**. You receive a verification email.

1. After verifying your email, enter your name and a password, and choose **Sign up**.

1. You are redirected to a page where you enter your site. Enter a site name and choose **Agree**.

Once your Atlassian Cloud site starts up, you can set up Jira by answering a few questions about your project type preferences.

To log in to an existing account:

1. Navigate to the [Atlassian login URL](https://id.atlassian.com/login).

1. Enter your email and password, and choose **Log in**. You are redirected to the Jira dashboard.

## Creating an app in your Jira Cloud


To create an app in Jira Cloud and obtain the Client ID and Client Secret from the managed client app:

1. Navigate to the [Jira Cloud URL](https://id.atlassian.com/login) and enter credentials.

1. Choose **Create** and select the **OAuth 2.0 integration** option.

1. Enter the app name, accept the terms and conditions, and choose **Create**.

1. Navigate to the **Distribution** section in the left menu and choose **Edit**.

1. In the **Edit distribution controls** section:

   1. Select **DISTRIBUTION STATUS** as **Sharing**.

   1. Enter the vendor name.

   1. Enter the URL for your **Privacy policy**. For example, https://docs.aws.amazon.com/glue/latest/dg/security-iam-awsmanpol.html

   1. Enter the URL for your **Terms of service** (optional).

   1. Enter the URL for your **Customer support contact** (optional).

   1. Select Yes/No from the **PERSONAL DATA DECLARATION** and choose **Save changes**.

1. Navigate to **Permissions** in the left menu for the respective app.

1. For **Jira API**, choose **Add**. Once added, choose the **Configuration** option.

1. Under the **Classic scopes** > **Jira platform REST API** section, choose **Edit Scopes**, select all scopes, and choose **Save**.

1. Under **Granular Scopes**, choose **Edit Scopes**. Scroll down to find the scopes. There are two kinds of scopes that you must select, under the headings "CRM" and "Standard".

1. Add the following scopes:

   ```
   read:application-role:jira
   read:audit-log:jira
   read:avatar:jira
   read:field:jira
   read:group:jira
   read:instance-configuration:jira
   read:issue-details:jira
   read:issue-event:jira
   read:issue-link-type:jira
   read:issue-meta:jira
   read:issue-security-level:jira
   read:issue-security-scheme:jira
   read:issue-type-scheme:jira
   read:issue-type-screen-scheme:jira
   read:issue-type:jira
   read:issue.time-tracking:jira
   read:label:jira
   read:notification-scheme:jira
   read:permission:jira
   read:priority:jira
   read:project:jira
   read:project-category:jira
   read:project-role:jira
   read:project-type:jira
   read:project-version:jira
   read:project.component:jira
   read:project.property:jira
   read:resolution:jira
   read:screen:jira
   read:status:jira
   read:user:jira
   read:workflow-scheme:jira
   read:workflow:jira
   read:field-configuration:jira
   read:issue-type-hierarchy:jira
   read:webhook:jira
   ```

1. Navigate to **Authentication** in the left menu and choose **Add**.

1. Enter a **Callback URL**, such as `https://us-east-1.console.aws.amazon.com/gluestudio/oauth`.

1. Navigate to **Settings** in the left menu and scroll down for **Authentication** details. Note the Client ID and Secret.

# Configuring Jira Cloud connections


Jira Cloud supports the `AUTHORIZATION_CODE` grant type for OAuth 2.
+ This grant type is considered "three-legged" OAuth because it relies on redirecting users to a third-party authorization server to authenticate the user. It is used when creating connections through the AWS Glue console. The AWS Glue console redirects the user to Jira Cloud, where the user must log in and allow AWS Glue the requested permissions to access their Jira Cloud instance.
+ Users may still opt to create their own connected app in Jira Cloud and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they are still redirected to Jira Cloud to log in and authorize AWS Glue to access their resources.
+ This grant type results in a refresh token and an access token. The access token is short-lived and can be refreshed automatically, without user interaction, by using the refresh token.
+ For public Jira Cloud documentation on creating a connected app for Authorization Code OAuth flow, see [Enabling OAuth 2.0 (3LO)](https://developer.atlassian.com/cloud/jira/platform/oauth-2-3lo-apps/#enabling-oauth-2-0--3lo-).

To configure a Jira Cloud connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For the customer managed connected app, the secret should contain the connected app's consumer secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key.

   1. Note: You must create a secret for the connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps:

   1. When selecting a **Connection type**, select Jira Cloud.

   1. Provide the Jira Cloud environment.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version":"2012-10-17",		 	 	 
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.
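
The secret created in step 1 holds a single key/value pair. A minimal sketch of its shape (plain Python; `build_secret_string` is a hypothetical helper, not part of any AWS SDK, and the secret value is a placeholder):

```python
import json

def build_secret_string(client_secret):
    # Hypothetical helper (not part of any AWS SDK): serialize the single
    # key/value pair that the Jira Cloud connection reads from the secret.
    return json.dumps(
        {"USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET": client_secret}
    )

# Placeholder value; use your connected app's real consumer secret.
print(build_secret_string("example-consumer-secret"))
```

The resulting JSON string is the form a key/value secret takes when stored in AWS Secrets Manager.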

# Reading from Jira Cloud entities


**Prerequisite**

A Jira Cloud object you would like to read from. You will need the object name such as Audit Record or Issue. The following table shows the supported entities.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Audit Record | Yes | Yes | No | Yes | Yes | 
| Issue | Yes | Yes | No | Yes | Yes | 
| Issue Field | No | No | No | Yes | No | 
| Issue Field Configuration | Yes | Yes | No | Yes | Yes | 
| Issue Link Type | No | No | No | Yes | No | 
| Issue Notification Scheme | Yes | Yes | No | Yes | Yes | 
| Issue Security Scheme | No | No | No | Yes | No | 
| Issue Type Scheme | Yes | Yes | Yes | Yes | Yes | 
| Issue Type Screen Scheme | Yes | Yes | Yes | Yes | Yes | 
| Issue Type | No | No | No | Yes | No | 
| Jira Setting | Yes | No | No | Yes | No | 
| Jira Setting Advanced | No | No | No | Yes | No | 
| Jira Setting Global | No | No | No | Yes | No | 
| Label | No | No | No | Yes | Yes | 
| Myself | Yes | No | No | Yes | No | 
| Permission | No | No | No | Yes | No | 
| Project | Yes | Yes | Yes | Yes | Yes | 
| Project Category | No | No | No | Yes | No | 
| Project Type | No | No | No | Yes | No | 
| Server Info | No | No | No | Yes | No | 
| Users | No | No | No | Yes | No | 
| Workflow | Yes | Yes | Yes | Yes | Yes | 
| Workflow Scheme | No | Yes | No | Yes | Yes | 
| Workflow Scheme Project Association | Yes | No | No | Yes | No | 
| Workflow Status | No | No | No | Yes | No | 
| Workflow Status Category | No | No | No | Yes | No | 

**Example**:

```
jiracloud_read = glueContext.create_dynamic_frame.from_options(
    connection_type="JiraCloud",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "audit-record",
        "API_VERSION": "v3"
    }
)
```

**Jira Cloud entity and field details**:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/jira-cloud-reading-from-entities.html)

## Partitioning queries


You can provide the additional Spark option `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With this parameter, the original query would be split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.
+ `NUM_PARTITIONS`: the number of partitions.

Example:

```
jiraCloud_read = glueContext.create_dynamic_frame.from_options(
    connection_type="JiraCloud",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "issue",
        "API_VERSION": "v3",
        "NUM_PARTITIONS": "10"
    }
)
```
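
Conceptually, the `NUM_PARTITIONS` split divides the overall read into contiguous chunks that Spark tasks can process concurrently. A plain-Python sketch of that chunking (illustrative only, not connector code):

```python
def split_into_partitions(total_records, num_partitions):
    # Divide a record range into near-equal, contiguous (start, end) chunks,
    # one per concurrent sub-query.
    base, remainder = divmod(total_records, num_partitions)
    chunks, start = [], 0
    for i in range(num_partitions):
        size = base + (1 if i < remainder else 0)
        chunks.append((start, start + size))
        start += size
    return chunks

# 100 records across 3 partitions -> chunk sizes 34, 33, 33
print(split_into_partitions(100, 3))
```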

# Jira Cloud connection options


The following are connection options for Jira Cloud:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in Jira Cloud.
+ `API_VERSION`(String) - (Required) Used for Read. Jira Cloud Rest API version you want to use. For example: v3.
+ `DOMAIN_URL`(String) - (Required) The Jira Cloud ID you want to use.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read.

# Limitations and notes for Jira Cloud connector


The following are limitations or notes for the Jira Cloud connector:
+  The `Contains` operator does not work with the `resourceName` field, which is of `String` data type. 
+  By default, if no explicit filter is applied, only issues from the past 30 days will be crawled. Users have the option to override this default filter by specifying a custom filter. 

# Connecting to Kustomer

Kustomer is a powerful customer experience platform that brings together everything you need to serve your customers better in one easy-to-use tool.

**Topics**
+ [

# AWS Glue support for Kustomer
](kustomer-support.md)
+ [

# Policies containing the API operations for creating and using connections
](kustomer-configuring-iam-permissions.md)
+ [

# Configuring Kustomer
](kustomer-configuring.md)
+ [

# Configuring Kustomer connections
](kustomer-configuring-connections.md)
+ [

# Reading from Kustomer entities
](kustomer-reading-from-entities.md)
+ [

# Kustomer connection options
](kustomer-connection-options.md)
+ [

# Kustomer limitations
](kustomer-connection-limitations.md)

# AWS Glue support for Kustomer


AWS Glue supports Kustomer as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Kustomer.

**Supported as a target?**  
No.

**Supported Kustomer API versions**  
The following Kustomer API versions are supported:
+ v1

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

If you don't want to use the preceding method, alternatively use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Kustomer


Before you can use AWS Glue to transfer data from Kustomer to supported destinations, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have an account with Kustomer that contains the data that you want to transfer. 
+ In the settings for your account, you've created an API key. For more information, see [Creating an API key](#kustomer-configuring-creating-an-api-key).
+ You provide the API key to AWS Glue while creating the connection.

If you meet these requirements, you’re ready to connect AWS Glue to your Kustomer account.

## Creating an API key


To create an API key that you will use to create a connection for the Kustomer connector in AWS Glue Studio:

1. Log in to the [Kustomer dashboard](https://amazon-appflow.kustomerapp.com/login) using your credentials.

1. Choose the **Settings** icon from the left menu.

1. Expand the **Security** drop down and select **API Keys**.

1. On the API key creation page, select **Add an API Key** in the top-right corner.

1. Fill in the mandatory inputs for the API key being created.
   + Name: any name for your API Key.
   + Roles: 'org' must be selected for the Kustomer APIs to function.
   + Expires (in days): the number of days you want the API key to be valid. You can keep it as **Never expires**, if it suits your use case.

1. Choose **Create**.

1. Store the API key (token) value for further usage to create a connection for the Kustomer connector in AWS Glue Studio.

# Configuring Kustomer connections


To configure a Kustomer connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. The secret should contain your Kustomer API key with `apiKey` as the key.

   1. Note: You must create a secret for your connections in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps:

   1. Under **Connections**, choose **Create connection**.

   1. When selecting a **Data Source**, select Kustomer.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

# Reading from Kustomer entities


**Prerequisite**

A Kustomer object you would like to read from. You will need the object name such as Brands or Cards. The following table shows the supported entities.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Brands | No | Yes | No | Yes | No | 
| Cards | No | Yes | No | Yes | No | 
| Chat Settings | No | No | No | Yes | No | 
| Companies | Yes | Yes | Yes | Yes | Yes | 
| Conversations | Yes | Yes | Yes | Yes | Yes | 
| Customers | Yes | Yes | Yes | Yes | Yes | 
| Customer Searches Pinned | No | Yes | No | Yes | No | 
| Customer Searches Position | No | No | No | Yes | No | 
| Email Hooks | No | Yes | No | Yes | No | 
| Web Hooks | No | Yes | No | Yes | No | 
| KB Articles | No | Yes | No | Yes | No | 
| KB Categories | No | Yes | No | Yes | No | 
| KB Forms | No | Yes | No | Yes | No | 
| KB Routes | No | Yes | No | Yes | No | 
| KB Tags | No | Yes | No | Yes | No | 
| KB Templates | No | Yes | No | Yes | No | 
| KB Themes | No | Yes | No | Yes | No | 
| Klasses | No | Yes | No | Yes | No | 
| KViews | No | Yes | No | Yes | No | 
| Messages | Yes | Yes | Yes | Yes | Yes | 
| Notes | Yes | Yes | Yes | Yes | Yes | 
| Notifications | No | Yes | No | Yes | No | 

**Example**:

```
Kustomer_read = glueContext.create_dynamic_frame.from_options(
    connection_type="kustomer",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "brands",
        "API_VERSION": "v1"
    }
)
```

## Kustomer entity and field details


For more information about the entities and field details, see:
+ [Brands](https://api.kustomerapp.com/v1/brands)
+ [Cards](https://api.kustomerapp.com/v1/cards)
+ [Chat Settings](https://api.kustomerapp.com/v1/chat/settings)
+ [Companies](https://api.kustomerapp.com/v1/companies)
+ [Conversations](https://api.kustomerapp.com/v1/conversations)
+ [Customers](https://api.kustomerapp.com/v1/customers)
+ [Customers Searches Pinned](https://api.kustomerapp.com/v1/customers/searches/pinned)
+ [Customer Searches Positions](https://api.kustomerapp.com/v1/customers/searches/positions)
+ [Hooks Email](https://api.kustomerapp.com/v1/hooks/email)
+ [Hooks Web](https://api.kustomerapp.com/v1/hooks/web)
+ [KB Articles](https://api.kustomerapp.com/v1/kb/articles)
+ [KB Categories](https://api.kustomerapp.com/v1/kb/categories)
+ [KB Forms](https://api.kustomerapp.com/v1/kb/forms)
+ [KB Routes](https://api.kustomerapp.com/v1/kb/routes)
+ [KB Tags](https://api.kustomerapp.com/v1/kb/tags)
+ [KB Templates](https://api.kustomerapp.com/v1/kb/templates)
+ [KB Themes](https://api.kustomerapp.com/v1/kb/themes)
+ [Klasses](https://api.kustomerapp.com/v1/klasses)
+ [Kviews](https://api.kustomerapp.com/v1/kviews)
+ [Messages](https://api.kustomerapp.com/v1/messages)
+ [Notes](https://api.kustomerapp.com/v1/notes)
+ [Notifications](https://api.kustomerapp.com/v1/notifications)


[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/kustomer-reading-from-entities.html)

## Partitioning queries


**Field-based partitioning**

You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query would be split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For the DateTime field, we accept the value in ISO format.

  Example of valid value:

  ```
  "2023-01-15T11:18:39.205Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.

Entity-wise partitioning field support details are captured in the following table:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/kustomer-reading-from-entities.html)

Example:

```
Kustomer_read = glueContext.create_dynamic_frame.from_options(
    connection_type="kustomer",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "conversation",
        "API_VERSION": "v1",
        "PARTITION_FIELD": "createdAt",
        "LOWER_BOUND": "2023-01-15T11:18:39.205Z",
        "UPPER_BOUND": "2023-02-15T11:18:39.205Z",
        "NUM_PARTITIONS": "2"
    }
)
```
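
Field-based partitioning amounts to slicing the `[LOWER_BOUND, UPPER_BOUND)` interval into `NUM_PARTITIONS` contiguous sub-ranges, one per sub-query. A plain-Python sketch of that slicing (illustrative only, not connector code):

```python
from datetime import datetime

def partition_bounds(lower, upper, num_partitions):
    # Slice an ISO-8601 datetime interval into contiguous sub-intervals,
    # one per concurrent sub-query.
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ"
    lo, hi = datetime.strptime(lower, fmt), datetime.strptime(upper, fmt)
    step = (hi - lo) / num_partitions
    return [(lo + i * step, lo + (i + 1) * step) for i in range(num_partitions)]

bounds = partition_bounds("2023-01-15T11:18:39.205Z",
                          "2023-02-15T11:18:39.205Z", 2)
for start, end in bounds:
    print(start.isoformat(), "->", end.isoformat())
```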

# Kustomer connection options


The following are connection options for Kustomer:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in Kustomer.
+ `API_VERSION`(String) - (Required) Used for Read. Kustomer Rest API version you want to use.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for read. Field to be used to partition query.
+ `LOWER_BOUND`(String)- Used for read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for read. An exclusive upper bound value of the chosen partition field. 
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for read. Number of partitions for read.
+ `INSTANCE_URL`(String) - (Required) Used for Read. Kustomer instance URL.

# Kustomer limitations


The following are limitations or notes for Kustomer:
+ The `Customer Searches` entity is not supported since the Kustomer API documentation has not declared any endpoint for it.
+ Filtering and incremental transfer are not supported on the `Klasses` entity.
+ Order by can be applied to multiple applicable fields in a single request.

  However, ordering by multiple fields has been observed to behave inconsistently on the SaaS side for some combinations, and which combinations are affected is unpredictable. For example:

  For the `Customers` entity, ordering by `progressiveStatus desc, name asc` doesn't yield a correctly sorted result; it sorts only by `progressiveStatus`. If you observe this behavior, order by a single field instead.
+ Order by on the field 'id' is only supported by the `Conversations` and `Messages` entities, as a query parameter. For example: https://api.kustomerapp.com/v1/conversations?sort=desc (this sorts the results by 'id' in descending order).

  Additionally, any other filter or ordering on any other field is translated into a POST request body, with an API endpoint such as POST https://api.kustomerapp.com/v1/customers/search. To support ordering by 'id' in `Conversations` and `Messages`, the request should contain either only the order by 'id', or filters and/or ordering on other applicable fields.
+ Kustomer allows a maximum of 10K records to be fetched, whether or not the request is filtered. Due to this limitation, there will be data loss for any entity holding more than 10K records. There are two possible workarounds to partially mitigate this:
  + Apply filters to fetch a specific set of records.
  + If there are more than 10K records with an applied filter, apply a successive filter value in a new subsequent request or apply ranges in filters. For example: 

    1st request's filterExpression: `modifiedAt >= 2022-03-15T05:26:23.000Z and modifiedAt < 2023-03-15T05:26:23.000Z`

    Assume this exhausts the 10K record limit.

    Another request can be triggered with filterExpression: `modifiedAt >= 2023-03-15T05:26:23.000Z`
+ As a SaaS behavior, the `CONTAINS` operator in Kustomer supports matching only on complete words and not partial matches within a word. For example: "body CONTAINS 'test record'" will match a record having 'test' in the 'body' field. However, "body CONTAINS 'test'" will not match a record having 'testAnotherRecord' in the 'body' field.
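
The successive-filter workaround described above can be sketched as generating a series of `modifiedAt` windows, each used as a separate filter expression. The helper below is hypothetical and illustrative only:

```python
def build_window_filters(start, boundaries):
    # Build successive modifiedAt filter expressions from boundary timestamps.
    # Each window is [previous boundary, next boundary); the last is open-ended.
    edges = [start] + list(boundaries)
    filters = [
        f"modifiedAt >= {lo} and modifiedAt < {hi}"
        for lo, hi in zip(edges, edges[1:])
    ]
    filters.append(f"modifiedAt >= {edges[-1]}")
    return filters

for expr in build_window_filters("2022-03-15T05:26:23.000Z",
                                 ["2023-03-15T05:26:23.000Z"]):
    print(expr)
```

Each generated expression would be issued as its own filtered read, so no single request has to cover more than 10K records.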

# Connecting to LinkedIn

LinkedIn is a paid marketing tool that offers access to LinkedIn social networks through various sponsored posts and other methods. LinkedIn is a powerful marketing tool for B2B companies to build leads, online recognition, share content, and more.

**Topics**
+ [

# AWS Glue support for LinkedIn
](linkedin-support.md)
+ [

# Policies containing the API operations for creating and using connections
](linkedin-configuring-iam-permissions.md)
+ [

# Configuring LinkedIn
](linkedin-configuring.md)
+ [

# Configuring LinkedIn connections
](linkedin-configuring-connections.md)
+ [

# Reading from LinkedIn entities
](linkedin-reading-from-entities.md)
+ [

# LinkedIn connection options
](linkedin-connection-options.md)
+ [

# Creating a LinkedIn account
](linkedin-create-account.md)
+ [

# Limitations
](linkedin-connector-limitations.md)

# AWS Glue support for LinkedIn


AWS Glue supports LinkedIn as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from LinkedIn.

**Supported as a target?**  
No.

**Supported LinkedIn API versions**  
202406 (June 2024)

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

If you don't want to use the preceding method, alternatively, use the following managed IAM policies:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring LinkedIn


Before you can use AWS Glue to transfer data from LinkedIn, you must meet the following requirements:

## Minimum requirements

+ You have a LinkedIn account. For more information about creating an account, see [Creating a LinkedIn account](linkedin-create-account.md). 
+ Your LinkedIn account is enabled for API access. 
+ You have created an `OAuth2 API` integration in your LinkedIn account. This integration provides the client credentials that AWS Glue uses to access your data securely when it makes authenticated calls to your account. For more information, see [Creating a LinkedIn account](linkedin-create-account.md).

If you meet these requirements, you’re ready to connect AWS Glue to your LinkedIn account. For typical connections, you don't need to do anything else in LinkedIn.

# Configuring LinkedIn connections


LinkedIn supports the `AUTHORIZATION_CODE` grant type for OAuth2.

This grant type is considered “three-legged” `OAuth` because it relies on redirecting users to the third-party authorization server to authenticate the user. Users may opt to create their own connected app in LinkedIn and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to LinkedIn to log in and authorize AWS Glue to access their resources.

This grant type results in both a refresh token and an access token. The access token expires 60 days after creation. A new access token can be obtained using the refresh token.

For public LinkedIn documentation on creating a connected app for `Authorization Code OAuth` flow, see [Authorization Code Flow (3-legged OAuth)](https://learn.microsoft.com/en-us/linkedin/shared/authentication/authorization-code-flow?toc=%2Flinkedin%2Fmarketing%2Ftoc.json&bc=%2Flinkedin%2Fbreadcrumb%2Ftoc.json&view=li-lms-2024-07&tabs=HTTPS1).

**Configuring a LinkedIn connection**

1.  In AWS Secrets Manager, create a secret with the following details: 
   + For a customer managed connected app – the secret should contain the connected app's consumer secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key.
   + For an AWS managed connected app – an empty secret, or a secret with a temporary value.
**Note**  
You must create a secret per connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps:

   1. When selecting a **Connection type**, select **LinkedIn**.

   1. Provide the LinkedIn environment.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection.

   1. Select the **Network options** if you want to use your network. 

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 

# Reading from LinkedIn entities


**Prerequisites** 

A LinkedIn object you would like to read from. Refer to the supported entities table below to check the available entities.

 **Supported entities** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select * | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Ad Accounts | Yes | Yes | Yes | Yes | No | 
| Campaigns | Yes | Yes | Yes | Yes | No | 
| Campaign Groups | Yes | Yes | Yes | Yes | No | 
| Creatives | Yes | Yes | Yes | Yes | No | 
| Ad Analytics | Yes | No | No | Yes | No | 
| Ad Analytics All AdAccounts | Yes | No | No | Yes | No | 
| Ad Analytics All Campaigns | Yes | No | No | Yes | No | 
| Ad Analytics All CampaignGroups | Yes | No | No | Yes | No | 
| Ad Analytics All AdCreatives | Yes | No | No | Yes | No | 
| Share Statistics | Yes | No | No | Yes | No | 
| Page Statistics | Yes | No | No | Yes | No | 
| Follower Statistics | Yes | No | No | Yes | No | 

 **Example** 

```
linkedin_read = glueContext.create_dynamic_frame.from_options(
    connection_type="linkedin",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "adaccounts",
        "API_VERSION": "202406"
    }
)
```


**LinkedIn entity and field details**  

|  **Field Data Type**  |  **Supported Filter Operators**  | 
| --- | --- | 
|  String  |  =  | 
|  DateTime  |  BETWEEN, =  | 
|  Numeric  |  =  | 
|  Boolean  |  =  | 
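
Given the operator support above, a `FILTER_PREDICATE` for a DateTime field typically uses `BETWEEN`, while other field types use `=`. A small sketch of building such Spark-SQL-format predicate strings (the field names shown are hypothetical, for illustration only):

```python
def between_predicate(field, start, end):
    # DateTime fields support BETWEEN, per the table above.
    return f"{field} BETWEEN '{start}' AND '{end}'"

def equals_predicate(field, value):
    # String, Numeric, and Boolean fields support equality only.
    return f"{field} = '{value}'"

# Hypothetical field names, for illustration only.
print(between_predicate("dateRange", "2024-06-01", "2024-06-30"))
print(equals_predicate("status", "ACTIVE"))
```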

# LinkedIn connection options


The following are connection options for LinkedIn:
+ `ENTITY_NAME`(String) – (Required) Used for Read/Write. The name of your Object in LinkedIn. For example, adAccounts. 
+ `API_VERSION`(String) – (Required) Used for Read/Write. LinkedIn Rest API version you want to use. The value will be 202406, as LinkedIn currently supports only version 202406.
+ `SELECTED_FIELDS`(List<String>) – Default: empty (SELECT *). Used for Read. Columns you want to select for the selected entity. 
+ `FILTER_PREDICATE`(String) – Default: empty. Used for Read. It should be in the Spark SQL format. 
+ `QUERY`(String) – Default: empty. Used for Read. Full Spark SQL query. 

# Creating a LinkedIn account


**Creating a LinkedIn App and OAuth credentials**

1. Navigate to your **LinkedIn Developer Network** page and log in with your LinkedIn account credentials. 

1. Navigate to the **My Apps** page and choose **Create Application** to create a new LinkedIn App.

1. Enter the following details into the app registration form:
   + **Company Name** – Select an existing company or create a new company.
   + **Name** – Enter the application name.
   + **Description** – Enter the application description.
   + **Application Logo** – Select an image file as your application logo.
   + **Application Use** – Select the use of your application.
   + **Website URL** – Enter the website URL that contains detailed information about your application.
   + **Business Email** – Enter your business email address.
   + **Business Phone** – Enter your business phone number.
   + **LinkedIn API Terms of Use** – Read and agree.

1. Upon completion of the app registration form, choose **Submit**.

   You will be redirected to the **Authentication** page, where the Authentication Keys (Client ID and Client Secret) and other relevant details will be displayed.

1. If your web application requires access to the user's email address from their LinkedIn account, select the `r_emailaddress` permission. Additionally, you can specify Authorized Redirect URLs for your LinkedIn application. 

**Creating a page in LinkedIn account**

1. Navigate to [LinkedIn Developer Products](https://developer.linkedin.com/).

1. At the upper-right corner of the **LinkedIn Developer Products** page, select **My apps**.

1. At the upper-right corner of the **My apps** page, select **Create app**.

1. On the **Create an app** page, enter your app name in the **App name** field.

1. In the **LinkedIn Page** field, enter your company page name or URL.
**Note**  
If you don’t have a LinkedIn Page, you can create one by selecting **Create a new LinkedIn**. 

1. In the **Privacy policy URL** field, enter your privacy policy URL.

1. Choose **Upload a logo** to upload an image that is to be displayed to users when they authorize with your app.

1. In the **Legal agreement** section, select **I have read and agree to these terms**.

1. Choose **Create app**. 

   Your new app will be created and will be available under the **My apps** tab.

**Publishing campaign ads in LinkedIn**

1. Log in to **Campaign Manager**. 

1. Select an existing **Campaign Group**, or choose **Create** to create a new one.

1. Select your objective.

1. Select your group, budget, and schedule.

1. Build your target audience.

1. Select your ad format.

1. Select your budget and schedule.

1. Set up your ad(s).

1. Review and Launch.

# Limitations


For the Ad Analytics entities `ad_analytics_all_adAccounts`, `ad_analytics_all_campaigns`, `ad_analytics_all_campaign_groups`, and `ad_analytics_all_adCreatives`, a filter is mandatory to retrieve records.

# Connecting to Mailchimp

Mailchimp is an all-in-one marketing platform that helps you manage and talk to your clients, customers, and other interested parties. Their approach to marketing focuses on healthy contact management practices, beautifully designed emails, unique automated workflows, and powerful data analysis. If you're a Mailchimp user, you can connect AWS Glue to your Mailchimp account. Then, you can use Mailchimp as a data source in your ETL jobs. Run these jobs to transfer data between Mailchimp and AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for Mailchimp
](mailchimp-support.md)
+ [

# Policies containing the API operations for creating and using connections
](mailchimp-configuring-iam-permissions.md)
+ [

# Configuring Mailchimp
](mailchimp-configuring.md)
+ [

# Configuring Mailchimp connections
](mailchimp-configuring-connections.md)
+ [

# Reading from Mailchimp entities
](mailchimp-reading-from-entities.md)
+ [

# Mailchimp connection options
](mailchimp-connection-options.md)
+ [

# Creating a Mailchimp account
](mailchimp-create-account.md)
+ [

# Limitations
](mailchimp-connector-limitations.md)

# AWS Glue support for Mailchimp


AWS Glue supports Mailchimp as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Mailchimp.

**Supported as a target?**  
No.

**Supported Mailchimp API versions**  
 3.0 

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, instead of the preceding policy, you can use the following managed IAM policies:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 
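
If you manage roles programmatically, the inline policy above can be built and attached with a short boto3 sketch. The role and policy names below are placeholders, and the commented-out `put_role_policy` call assumes your credentials are allowed to call `iam:PutRolePolicy`:

```python
import json

def build_glue_connection_policy():
    """Build the inline policy from this section as a Python dict."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "glue:ListConnectionTypes",
                    "glue:DescribeConnectionType",
                    "glue:RefreshOAuth2Tokens",
                    "glue:ListEntities",
                    "glue:DescribeEntity",
                ],
                "Resource": "*",
            }
        ],
    }

# Attaching it to a role (placeholder names; requires boto3 and AWS credentials):
# import boto3
# boto3.client("iam").put_role_policy(
#     RoleName="MyGlueConnectionRole",
#     PolicyName="GlueConnectionAccess",
#     PolicyDocument=json.dumps(build_glue_connection_policy()),
# )
```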

# Configuring Mailchimp


Before you can use AWS Glue to transfer data from Mailchimp, you must meet the following requirements:

## Minimum requirements

+ You have a Mailchimp account with an email and password. For more information about creating an account, see [Creating a Mailchimp account](mailchimp-create-account.md). 
+ You have an AWS account with access to AWS Glue. 
+ Ensure you have created one of the following resources. These resources provide credentials that AWS Glue uses to securely access your data when making authenticated calls to your account:
  + A Developer App that supports OAuth 2.0 authentication. For more information about creating a Developer App, see [Creating a Mailchimp account](mailchimp-create-account.md). 

If you meet these requirements, you’re ready to connect AWS Glue to your Mailchimp account. For typical connections, you don't need to do anything else in Mailchimp.

# Configuring Mailchimp connections


 Mailchimp supports the following two authentication mechanisms: 
+ Mailchimp supports the `AUTHORIZATION_CODE` grant type.
  + This grant type is considered “three-legged” `OAuth` because it relies on redirecting users to the third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue Console. By default, the user creating a connection can rely on an AWS Glue-owned connected app and does not need to provide any `OAuth`-related information other than their Mailchimp Client ID and Client Secret. The AWS Glue Console redirects the user to Mailchimp, where the user must log in and grant AWS Glue the requested permissions to access their Mailchimp instance.
  + Users may still opt to create their own connected app in Mailchimp and provide their own Client ID and Client Secret when creating connections through the AWS Glue Console. In this scenario, they are still redirected to Mailchimp to log in and authorize AWS Glue to access their resources.
  + For public Mailchimp documentation on creating a connected app for the `AUTHORIZATION_CODE` `OAuth` flow, see [Access Data on Behalf of Other Users with OAuth 2](https://mailchimp.com/developer/marketing/guides/access-user-data-oauth-2/?msockid=141ebf9ffb4d619525c3ad27fad660d6). 
+ **Custom Auth** – For public Mailchimp documentation about generating the required API keys for custom authorization, see [ About API Keys](https://mailchimp.com/en/help/about-api-keys/?msockid=310fd0fe09d16afe034fc5de08d76b01). 



To configure a Mailchimp connection:

1. In AWS Secrets Manager, create a secret with the following details: 
   + `OAuth` auth – For a customer managed connected app: the secret should contain the connected app Consumer Secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key. 
   + Custom auth – For a customer managed connected app: the secret should contain the API key with `api_key` as the key. 
**Note**  
You must create a separate secret for each connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. Under **Connections**, select **Create connection**. 

   1. When selecting a **Data Source**, select Mailchimp.

   1. Provide the Mailchimp `instanceUrl`.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions: 

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the authentication type to connect to Mailchimp:
      + For `OAuth` auth – Provide the Token URL and the User Managed Client Application ClientId of the Mailchimp instance that you want to connect to.
      + For Custom auth – Select the CUSTOM authentication type to connect to Mailchimp.

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection. 

   1. Optionally, select the network options if you want to use your own network. 

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 

1. In your AWS Glue job configuration, provide `connectionName` as an Additional network connection.
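
The Secrets Manager payload in step 1 differs only in its key name between the two authentication types. The following is a minimal sketch of building that payload; the key names follow step 1 (the custom-auth key is assumed to be `api_key`), the secret name is a placeholder, and the `create_secret` call is commented out because it requires boto3 and AWS credentials:

```python
import json

def build_mailchimp_secret(auth_type, secret_value):
    """Build the SecretString payload described in step 1.

    OAuth connections expect USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET
    as the key; custom auth is assumed to expect api_key.
    """
    if auth_type == "oauth":
        key = "USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET"
    elif auth_type == "custom":
        key = "api_key"
    else:
        raise ValueError(f"unsupported auth type: {auth_type}")
    return json.dumps({key: secret_value})

# Creating the secret (placeholder name; requires boto3 and AWS credentials):
# import boto3
# boto3.client("secretsmanager").create_secret(
#     Name="glue/mailchimp/my-connection",
#     SecretString=build_mailchimp_secret("oauth", "my-client-secret"),
# )
```

Remember that each AWS Glue connection needs its own secret, so a separate `create_secret` call is made per connection.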

# Reading from Mailchimp entities


 **Prerequisites** 

A Mailchimp object that you would like to read from. Refer to the supported entities table below to check the available entities. 

 **Supported entities** 
+ [Abuse-reports ](https://mailchimp.com/developer/marketing/api/campaign-abuse/)
+ [Automation](https://mailchimp.com/developer/marketing/api/automation/list-automations/)
+ [Campaigns](https://mailchimp.com/developer/marketing/api/campaigns/list-campaigns/)
+ [Click-details](https://mailchimp.com/developer/marketing/api/link-clickers/)
+ [Lists](https://mailchimp.com/developer/marketing/api/link-clickers/)
+ [Members](https://mailchimp.com/developer/marketing/api/list-segment-members/)
+ [Open-details](https://mailchimp.com/developer/marketing/api/list-members/)
+ [Segments](https://mailchimp.com/developer/marketing/api/list-segments/)
+ [Stores](https://mailchimp.com/developer/marketing/api/ecommerce-stores/list-stores/)
+ [Unsubscribed](https://mailchimp.com/developer/marketing/api/unsub-reports/)


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select * | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Automation | Yes | Yes | Yes | Yes | Yes | 
| Campaigns | No | No | No | No | No | 
| Lists | Yes | Yes | No | Yes | Yes | 
| Reports Abuse | No | Yes | No | Yes | Yes | 
| Reports Open | No | Yes | No | Yes | Yes | 
| Reports Click | Yes | Yes | No | Yes | Yes | 
| Reports Unsubscribe | No | Yes | No | Yes | Yes | 
| Segment | No | Yes | No | Yes | Yes | 
| Segment Members | Yes | Yes | No | Yes | No | 
| Stores | Yes | Yes | Yes | Yes | No | 

 **Example** 

```
mailchimp_read = glueContext.create_dynamic_frame.from_options(
    connection_type="mailchimp",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "stores",
        "INSTANCE_URL": "https://us14.api.mailchimp.com",
        "API_VERSION": "3.0"
    }
)
```

 **Mailchimp entity and field details** 
+ [Abuse-reports ](https://mailchimp.com/developer/marketing/api/campaign-abuse/)
+ [Automation](https://mailchimp.com/developer/marketing/api/automation/list-automations/)
+ [Campaigns](https://mailchimp.com/developer/marketing/api/campaigns/list-campaigns/)
+ [Click-details](https://mailchimp.com/developer/marketing/api/link-clickers/)
+ [Lists](https://mailchimp.com/developer/marketing/api/link-clickers/)
+ [Members](https://mailchimp.com/developer/marketing/api/list-segment-members/)
+ [Open-details](https://mailchimp.com/developer/marketing/api/list-members/)
+ [Segments](https://mailchimp.com/developer/marketing/api/list-segments/)
+ [Stores](https://mailchimp.com/developer/marketing/api/ecommerce-stores/list-stores/)
+ [Unsubscribed](https://mailchimp.com/developer/marketing/api/unsub-reports/)

## Partitioning queries


You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For the DateTime field, we accept the value in ISO format.

  Example of valid value:

  ```
  "2024-07-01T00:00:00.000Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.
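
To see how the four options interact, the following sketch (not part of the connector; an illustration of the splitting logic only) divides a datetime range into `NUM_PARTITIONS` contiguous, non-overlapping sub-ranges, each of which would back one sub-query:

```python
from datetime import datetime, timedelta

def split_partitions(lower_bound, upper_bound, num_partitions):
    """Split [lower_bound, upper_bound) into contiguous sub-ranges,
    mirroring how NUM_PARTITIONS sub-queries would divide the range."""
    fmt = "%Y-%m-%dT%H:%M:%S.%f%z"
    lo = datetime.strptime(lower_bound.replace("Z", "+0000"), fmt)
    hi = datetime.strptime(upper_bound.replace("Z", "+0000"), fmt)
    step = (hi - lo) / num_partitions
    return [(lo + i * step, lo + (i + 1) * step) for i in range(num_partitions)]

# Three one-day sub-ranges covering the original three-day interval.
parts = split_partitions(
    "2024-07-01T00:00:00.000Z", "2024-07-04T00:00:00.000Z", 3
)
```

Note that the lower bound of each sub-range is inclusive and the upper bound exclusive, matching the semantics of `LOWER_BOUND` and `UPPER_BOUND` above.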

The following table describes the entity partitioning field support details:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/mailchimp-reading-from-entities.html)

Example:

```
read_read = glueContext.create_dynamic_frame.from_options(
    connection_type="mailchimp",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "automations",
        "API_VERSION": "3.0",
        "INSTANCE_URL": "https://us14.api.mailchimp.com",
        "PARTITION_FIELD": "create_time",
        "LOWER_BOUND": "2024-02-05T14:09:30.115Z",
        "UPPER_BOUND": "2024-06-07T13:30:00.134Z",
        "NUM_PARTITIONS": "3"
    }
)
```

# Mailchimp connection options


The following are connection options for Mailchimp:
+  `ENTITY_NAME`(String) – (Required) Used for Read/Write. The name of your Object in Mailchimp. 
+ `INSTANCE_URL`(String) - (Required) A valid Mailchimp Instance URL.
+ `API_VERSION`(String) - (Required) Used for Read. The Mailchimp REST API version that you want to use. For example: 3.0.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition query.
+ `LOWER_BOUND`(String)- Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field. 
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read.
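
Several of these options can be combined in one read. The following is a hedged sketch of a fuller `connection_options` map; the field names under `SELECTED_FIELDS` and the filter value are illustrative assumptions, not taken from the Mailchimp schema:

```python
# Illustrative connection options combining several of the settings above.
# The selected fields and filter value are assumptions for illustration.
connection_options = {
    "connectionName": "connectionName",
    "ENTITY_NAME": "campaigns",
    "INSTANCE_URL": "https://us14.api.mailchimp.com",
    "API_VERSION": "3.0",
    "SELECTED_FIELDS": ["id", "status", "create_time"],
    "FILTER_PREDICATE": "create_time > '2024-07-01T00:00:00Z'",
    "NUM_PARTITIONS": "1",
}
```

This map is then passed to `glueContext.create_dynamic_frame.from_options` with `connection_type="mailchimp"`, as in the read example earlier in this section.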

# Creating a Mailchimp account


1. Navigate to the [Mailchimp login page](https://login.mailchimp.com/?locale=en), enter your email address and password, and then choose **Sign up**.

1. Open the confirmation email from Mailchimp, and choose the confirmation link to verify your account.
**Note**  
The time it takes to receive the activation email may vary. If you haven't received it, check your spam folder and review Mailchimp's activation email troubleshooting tips. Mailchimp blocks signups from role-based email addresses, such as [admin@pottedplanter.com](mailto:admin@pottedplanter.com) or [security@example.com](mailto:security@example.com).  


   The first time you log in to your account, Mailchimp asks for required information. Mailchimp uses this information to help ensure your account is compliant with their Terms of Use and to provide guidance that's relevant to you and your company's needs.

1. Enter your information, follow the prompts to finish the activation process, and get started in your new Mailchimp account.

**Registering an `OAuth2.0` application**

1. Navigate to the [Mailchimp login page](https://login.mailchimp.com/?locale=en), enter your email address and password, and choose **Log in**. 

1. Select the **User** icon in the upper-right corner, and then choose **Account and billing** from the dropdown menu.

1. Select **Extras** and choose **Registered apps** from the dropdown menu.

1. Locate and choose **Register An App**.

1. Enter the following details:
   + **App name** – Name of the app. 
   + **Company / Organization** – Name of your Company or Organization.
   + **App website** – Website of the app.
   + **Redirect URI** – A URI path (or comma-separated list of paths) to which Mailchimp can redirect when the login flow is complete. For example, `https://ap-southeast-2.console.aws.amazon.com`

1. Choose **Create**. 

1. The **Client ID** and **Client Secret** will now be visible. Copy and save them in a secure location. Then, choose **Done**. 
**Note**  
Your Client ID and Client Secret strings are credentials used to establish a connection with this connector when using AppFlow or AWS Glue.

**Generating an API key**

1. Navigate to the [Mailchimp login page](https://login.mailchimp.com/?locale=en), enter your email address and password, and choose **Log in**. 

1. Select the **User** icon in the upper-right corner, and then choose **Account and billing** from the dropdown menu.

1. Select **Extras** and choose **API keys** from the dropdown menu.

1. Choose **Create A Key**.

1. Enter a name for the key and choose **Generate Key**.

   The next page displays the generated API key. 

1. Copy your key, store it securely, and choose **Done**.

# Limitations


The following are limitations for the Mailchimp connector:
+ Filtering is supported only for the `Campaigns`, `Automations`, `Lists`, `Open Details`, `Members`, and `Segments` entities.
+ When using a filter on a `DateTime` field, you must pass values in this format: `yyyy-MM-ddTHH:mm:ssZ`
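
A small sketch of producing a value in the required `DateTime` filter format (`yyyy-MM-ddTHH:mm:ssZ`, i.e. an ISO timestamp in UTC) from a Python datetime; the `create_time` field name in the predicate is illustrative:

```python
from datetime import datetime, timezone

def to_mailchimp_datetime(dt):
    """Format a datetime as yyyy-MM-ddTHH:mm:ssZ (UTC) for DateTime filters."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

cutoff = to_mailchimp_datetime(datetime(2024, 7, 1, tzinfo=timezone.utc))
predicate = f"create_time > '{cutoff}'"  # field name is illustrative
```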

# Connecting to Microsoft Dynamics 365 CRM


 Microsoft Dynamics 365 is a product line of enterprise resource planning and customer relationship management intelligent business applications. 

**Topics**
+ [

# AWS Glue support for Microsoft Dynamics 365
](microsoft-dynamics-365-support.md)
+ [

# Policies containing the API operations for creating and using connections
](microsoft-dynamics-365-configuring-iam-permissions.md)
+ [

# Configuring Microsoft Dynamics 365 CRM
](microsoft-dynamics-365-configuring.md)
+ [

# Configuring Microsoft Dynamics 365 CRM connections
](microsoft-dynamics-365-configuring-connections.md)
+ [

# Reading from Microsoft Dynamics 365 CRM entities
](microsoft-dynamics-365-reading-from-entities.md)
+ [

# Microsoft Dynamics 365 CRM connection option reference
](microsoft-dynamics-365-connection-options.md)
+ [

# Limitations
](microsoft-dynamics-365-connector-limitations.md)

# AWS Glue support for Microsoft Dynamics 365


AWS Glue supports Microsoft Dynamics 365 as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Microsoft Dynamics 365.

**Supported as a target?**  
No.

**Supported Microsoft Dynamics 365 CRM API versions**  
 v9.2. 

# Policies containing the API operations for creating and using connections

 The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

You can also use the following managed IAM policies to allow access:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring Microsoft Dynamics 365 CRM


Before you can use AWS Glue to transfer data from Microsoft Dynamics 365 CRM, you must meet these requirements:

## Minimum requirements

+  You have a Microsoft Dynamics 365 CRM developer account with a client ID and client secret. 
+  Your Microsoft Dynamics 365 CRM account has API access with a valid license. 

 If you meet these requirements, you’re ready to connect AWS Glue to your Microsoft Dynamics 365 CRM account. For typical connections, you don't need to do anything else in Microsoft Dynamics 365 CRM. 

# Configuring Microsoft Dynamics 365 CRM connections


 **AUTHORIZATION_CODE grant type** 
+ This grant type is considered “three-legged” OAuth because it relies on redirecting users to the third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue Console. The AWS Glue Console redirects the user to Microsoft Dynamics 365 CRM, where the user must log in and grant AWS Glue the requested permissions to access their Microsoft Dynamics 365 CRM instance. 
+ Users may opt to create their own connected app in Microsoft Dynamics 365 CRM and provide their own client ID and client secret when creating connections through the AWS Glue Console. In this scenario, they are still redirected to Microsoft Dynamics 365 CRM to log in and authorize AWS Glue to access their resources. 
+ This grant type results in a refresh token and an access token. The access token is short-lived and can be refreshed automatically without user interaction by using the refresh token. 
+ For public Microsoft Dynamics 365 CRM documentation on creating a connected app for the Authorization Code OAuth flow, see [Microsoft App Registration](https://learn.microsoft.com/en-us/power-apps/developer/data-platform/authenticate-oauth#app-registration) on Microsoft Learn. 

Microsoft Dynamics 365 CRM supports OAuth2.0 authentication.

To configure a Microsoft Dynamics 365 CRM connection:

1. In AWS Secrets Manager, create a secret with the following details. You must create a separate secret for each connection in AWS Glue. 
   + For the AuthorizationCode grant type: 

      For a customer managed connected app – the secret should contain the connected app Client Secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key. 

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below: 

   1. When selecting a **Data Source**, select Microsoft Dynamics 365 CRM.

   1. Provide the **INSTANCE_URL** of the Microsoft Dynamics 365 CRM instance that you want to connect to.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions: 

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1.  Select **Token URL** and **Authorization Code URL** to access your Microsoft Dynamics 365 CRM workspace. 

   1.  Provide **User Managed Client Application ClientId** of your Microsoft Dynamics 365 CRM app. 

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection. 

   1. Optionally, select the network options if you want to use your own network. 

1.  Grant the IAM role associated with your AWS Glue job permission to read `secretName`. Choose **Next**. 

1.  In your AWS Glue job configuration, provide `connectionName` as an **Additional network connection**. 

# Reading from Microsoft Dynamics 365 CRM entities


 **Prerequisites** 
+  A Microsoft Dynamics 365 CRM object that you would like to read from. You will need the object name, such as contacts or accounts. The following table shows the supported entities. 

 **Supported entities** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select * | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Dynamic entity | Yes | Yes | Yes | Yes | Yes | 

 **Example** 

```
dynamics365_read = glueContext.create_dynamic_frame.from_options(
    connection_type="microsoftdynamics365crm",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "dynamic_entity",
        "API_VERSION": "v9.2",
        "INSTANCE_URL": "https://{tenantID}.api.crm.dynamics.com"
    }
)
```

## Microsoft Dynamics 365 CRM Entity and Field Details


 **Entities with dynamic metadata:** 

Microsoft Dynamics 365 CRM provides endpoints to fetch metadata dynamically. Therefore, for dynamic entities, the operator support is captured at datatype level.

<a name="microsoft-dynamics-365-metadata-table"></a>[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/microsoft-dynamics-365-reading-from-entities.html)

 **Partitioning queries** 

Microsoft Dynamics 365 CRM supports only field-based partitioning.

 You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently. 
+  `PARTITION_FIELD`: the name of the field to be used to partition the query. 
+  `LOWER_BOUND`: an inclusive lower bound value of the chosen partition field. 

   For Datetime, we accept the Spark timestamp format used in Spark SQL queries. Example of valid values: `"2024-01-30T06:47:51.000Z"`. 
+  `UPPER_BOUND`: an exclusive upper bound value of the chosen partition field. 
+  `NUM_PARTITIONS`: number of partitions. 

The following table describes the entity partitioning field support details:


| Entity Name | Partitioning Fields | DataType | 
| --- | --- | --- | 
| Dynamic Entity (Standard entity) | Dynamic DateTime fields that are queryable | DateTime | 
| Dynamic Entity (Custom entity) | createdon, modifiedon | DateTime | 

 **Example** 

```
dynamics365_read = glueContext.create_dynamic_frame.from_options(
    connection_type="microsoftdynamics365crm",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "dynamic_entity",
        "API_VERSION": "v9.2",
        "INSTANCE_URL": "https://{tenantID}.api.crm.dynamics.com",
        "PARTITION_FIELD": "createdon",
        "LOWER_BOUND": "2024-01-30T06:47:51.000Z",
        "UPPER_BOUND": "2024-06-30T06:47:51.000Z",
        "NUM_PARTITIONS": "10"
    }
)
```

# Microsoft Dynamics 365 CRM connection option reference


The following are connection options for Microsoft Dynamics 365 CRM:
+  `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in Microsoft Dynamics 365 CRM. 
+  `API_VERSION`(String) - (Required) Used for Read. Microsoft Dynamics 365 CRM Rest API version you want to use. 
+  `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object. 
+  `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format. 
+  `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query. 
+  `INSTANCE_URL`(String) - (Required) A valid Microsoft Dynamics 365 CRM Instance URL with the format: `https://{tenantID}.api.crm.dynamics.com` 
+  `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read. 
+  `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition query. 
+  `LOWER_BOUND`(String)- Used for Read. An inclusive lower bound value of the chosen partition field. Example: `2024-01-30T06:47:51.000Z`. 
+  `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field. Example: `2024-06-30T06:47:51.000Z`. 

# Limitations


The following are limitations for the Microsoft Dynamics 365 CRM connector:
+  Record-based partitioning is not supported because Microsoft Dynamics 365 CRM does not support an offset parameter. 
+  Pagination is set at a maximum of 500 records per page to avoid Internal Server exceptions from SaaS due to a combination of data size and rate limitations. 
  + [SaaS documentation on pagination](https://learn.microsoft.com/en-us/power-apps/developer/data-platform/webapi/query/page-results?view=dataverse-latest)
  + [SaaS documentation on rate limits](https://learn.microsoft.com/en-us/power-apps/developer/data-platform/api-limits?tabs=sdk)
+  Microsoft Dynamics 365 CRM supports `order by` on only parent fields for all entities. `order by` is not supported on sub-fields. 
  + Both ASC and DESC directions are supported.
  + `order by` on multiple fields is supported.
+  Filtering on the "createddatetime" field of the `aadusers` standard entity throws a bad request error from the SaaS API, even though the field supports filtration. Because of the dynamic nature of the metadata, other entities with a similar issue cannot be specifically identified, and the root cause is unknown, so this cannot be handled. 
+  Complex object types, such as Struct, List, and Map do not support filtration. 
+  Many fields that can be retrieved from a response have `isRetrievable` marked as `false` in dynamic metadata response. In order to avoid data loss, `isRetrievable` is set to `true` for all fields. 
+  Field-based partitioning is supported on all entities that meet the following criteria: 
  + Queryable DateTime fields must be present in standard entities, or the system-generated `createdon` and `modifiedon` fields must be present in custom entities. 
  + No SaaS metadata API exclusively identifies system-generated fields or the nullable property; however, as a general practice, only the fields available by default are filterable and non-nullable. Therefore, the above field-selection criterion is considered null safe, and if a field is filterable, it is eligible for partitioning.
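
Given the 500-records-per-page cap noted above, you can estimate how many paged requests a full read will issue. A trivial sketch:

```python
import math

PAGE_SIZE = 500  # maximum records per page, per the limitation above

def pages_needed(total_records, page_size=PAGE_SIZE):
    """Number of paged requests needed to fetch total_records."""
    return math.ceil(total_records / page_size) if total_records else 0

# For example, 1,250 records require 3 paged requests.
```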

# Connecting to Microsoft Teams


 Microsoft Teams is a collaborative workspace within Microsoft 365 that acts as a central hub for workplace conversations, collaborative teamwork, video chats and document sharing, all designed to aid worker productivity in a unified suite of tools. 

**Topics**
+ [

# AWS Glue support for Microsoft Teams
](microsoft-teams-support.md)
+ [

# Policies containing the API operations for creating and using connections
](microsoft-teams-configuring-iam-permissions.md)
+ [

# Configuring Microsoft Teams
](microsoft-teams-configuring.md)
+ [

# Configuring Microsoft Teams connections
](microsoft-teams-configuring-connections.md)
+ [

# Reading from Microsoft Teams entities
](microsoft-teams-reading-from-entities.md)
+ [

# Microsoft Teams connection option reference
](microsoft-teams-connection-options.md)
+ [

# Limitations
](microsoft-teams-connector-limitations.md)
+ [

## Create a new Microsoft Teams account:
](#microsoft-teams-account-creation)

# AWS Glue support for Microsoft Teams


AWS Glue supports Microsoft Teams as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Microsoft Teams.

**Supported as a target?**  
No.

**Supported Microsoft Teams API versions**  
 v1. For entity support per version, see the supported entities for the source. 

# Policies containing the API operations for creating and using connections

 The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

You can also use the following managed IAM policies to allow access:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring Microsoft Teams


Before you can use AWS Glue to transfer data from Microsoft Teams, you must meet these requirements:

## Minimum requirements

+  You have a Microsoft Teams developer account with an email and password. For more information, see [Create a new Microsoft Teams account:](connecting-to-microsoft-teams.md#microsoft-teams-account-creation). 
+  You have set up an OAuth2 app in your Microsoft account that provides the client ID and secret credentials that AWS Glue uses to access your data securely when it makes authenticated calls to your account. For more information, see [Create a new Microsoft Teams account:](connecting-to-microsoft-teams.md#microsoft-teams-account-creation). 

 If you meet these requirements, you’re ready to connect AWS Glue to your Microsoft Teams account. For typical connections, you don't need to do anything else in Microsoft Teams. 

# Configuring Microsoft Teams connections


Microsoft Teams supports the following authentication mechanism:

1.  OAuth: Microsoft Teams supports the AUTHORIZATION_CODE grant type for OAuth2. 
   +  This grant type is considered “three-legged” OAuth because it relies on redirecting users to the third-party authorization server to authenticate. It is used when creating connections through the AWS Glue console. By default, the user creating a connection can rely on an AWS Glue-owned connected app and does not need to provide any OAuth-related information other than the Microsoft Teams instance URL. The AWS Glue console redirects the user to Microsoft Teams, where the user must log in and grant AWS Glue the requested permissions to access their Microsoft Teams instance. 
   +  Users can instead create their own connected app in Microsoft Teams and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they are still redirected to Microsoft Teams to log in and authorize AWS Glue to access their resources. 
   +  This grant type results in a refresh token and an access token. The access token is active for one hour and can be refreshed automatically, without user interaction, using the refresh token. 
   +  For the public Microsoft Teams documentation on creating a connected app for the Authorization Code OAuth flow, see [Register an application with the Microsoft identity platform - Microsoft Graph](https://learn.microsoft.com/en-us/graph/auth-register-app-v2) on Microsoft Learn. 

To configure a Microsoft Teams connection:

1.  In AWS Secrets Manager, create a secret with the following details. You must create a secret for each connection in AWS Glue. 

   1.  For OAuth auth: 
      +  For a customer managed connected app, the secret should contain the connected app's consumer secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key. 

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below: 

   1.  Under Data Connections, choose **Create connection**. 

   1. When selecting a **Data Source**, select Microsoft Teams.

   1. Provide your Microsoft Teams **Tenant ID**.

   1.  Select the IAM role that AWS Glue can assume and that has permissions for the following actions: 

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1.  Provide the **User Managed Client Application ClientId** of your Microsoft Teams app. 

   1.  Select the `secretName` that you want AWS Glue to use to store tokens for this connection. 

   1.  Select the network options if you want to use your network. 

1.  Grant the IAM role associated with your AWS Glue job permission to read `secretName`. Choose **Next**. 

1.  In your AWS Glue job configuration, provide `connectionName` as an **Additional network connection**. 
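Step 1 of the procedure above requires a Secrets Manager secret per connection. The following is a minimal sketch of creating one with the AWS SDK for Python (boto3); the secret name and placeholder value are hypothetical, and the key name is the one the connector expects for a customer managed app.

```python
import json

# Hypothetical secret name; replace the placeholder with your app's client secret.
SECRET_NAME = "glue/microsoft-teams/my-connection"
secret_payload = {
    # This exact key is required for a customer managed connected app.
    "USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET": "replace-with-your-client-secret",
}

def create_connection_secret():
    """Store the payload in AWS Secrets Manager (requires AWS credentials)."""
    import boto3  # AWS SDK for Python
    client = boto3.client("secretsmanager")
    return client.create_secret(
        Name=SECRET_NAME,
        SecretString=json.dumps(secret_payload),
    )
```

The secret value is a JSON object, so additional keys can be added later with `put_secret_value` without recreating the secret.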

# Reading from Microsoft Teams entities


 **Prerequisites** 
+  A Microsoft Teams object you would like to read from. You will need the object name, such as `team` or `channel-message`. The following table shows the supported entities. 

 **Supported entities for Source** 

 All entities are supported with API version 1.0. 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select * | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Teams | No | No | No | Yes | No | 
| Team Members | Yes | Yes | No | Yes | Yes | 
| Groups | Yes | Yes | Yes | Yes | Yes | 
| Group Members | Yes | Yes | No | Yes | No | 
| Channels | Yes | No | No | Yes | Yes | 
| Channel Messages | No | Yes | No | Yes | No | 
| Channel Message Replies | No | Yes | No | Yes | No | 
| Channel Tabs | Yes | No | No | Yes | No | 
| Chats | Yes | Yes | Yes | Yes | Yes | 
| Calendar Events | Yes | Yes | Yes | Yes | Yes | 

 **Example** 

```
MicrosoftTeams_read = glueContext.create_dynamic_frame.from_options(
    connection_type="MicrosoftTeams",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "company",
        "API_VERSION": "v1.0"
    }
)
```
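The options passed to `create_dynamic_frame.from_options` can be assembled and validated up front. A minimal sketch of a helper that builds the `connection_options` dictionary from the documented option names (the connection name below is hypothetical):

```python
def teams_read_options(connection_name, entity_name, api_version="v1.0", **extra):
    """Build connection_options for a Microsoft Teams read.

    connectionName, ENTITY_NAME, and API_VERSION are the options shown
    above; any extra keyword arguments (for example PARTITION_FIELD)
    are passed through unchanged.
    """
    options = {
        "connectionName": connection_name,
        "ENTITY_NAME": entity_name,
        "API_VERSION": api_version,
    }
    options.update(extra)
    return options

# Options for reading the "team" entity through a hypothetical connection.
opts = teams_read_options("my-teams-connection", "team")
```

Centralizing the option names in one helper avoids typos in the upper-case keys, which the connector would otherwise silently ignore or reject at run time.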

## Microsoft Teams Entity and Field Details


 Entities list: 
+  Team: [ https://docs.microsoft.com/en-us/graph/api/user-list-joinedteams?view=graph-rest-1.0 ](https://docs.microsoft.com/en-us/graph/api/user-list-joinedteams?view=graph-rest-1.0) 
+  Team-Member: [ https://docs.microsoft.com/en-us/graph/api/team-list-members?view=graph-rest-1.0 ](https://docs.microsoft.com/en-us/graph/api/team-list-members?view=graph-rest-1.0) 
+  Group: [ https://docs.microsoft.com/en-us/graph/api/group-list?view=graph-rest-1.0 ](https://docs.microsoft.com/en-us/graph/api/group-list?view=graph-rest-1.0) 
+  Group-Member: [ https://docs.microsoft.com/en-us/graph/api/group-list-members?view=graph-rest-1.0 ](https://docs.microsoft.com/en-us/graph/api/group-list-members?view=graph-rest-1.0) 
+  Channel: [ https://docs.microsoft.com/en-us/graph/api/channel-list?view=graph-rest-1.0 ](https://docs.microsoft.com/en-us/graph/api/channel-list?view=graph-rest-1.0) 
+  Channel-Message: [ https://docs.microsoft.com/en-us/graph/api/channel-list-messages?view=graph-rest-1.0 ](https://docs.microsoft.com/en-us/graph/api/channel-list-messages?view=graph-rest-1.0) 
+  Channel-Message-Reply: [ https://docs.microsoft.com/en-us/graph/api/chatmessage-list-replies?view=graph-rest-1.0 ](https://docs.microsoft.com/en-us/graph/api/chatmessage-list-replies?view=graph-rest-1.0) 
+  Channel-Tab: [ https://docs.microsoft.com/en-us/graph/api/channel-list-tabs?view=graph-rest-1.0 ](https://docs.microsoft.com/en-us/graph/api/channel-list-tabs?view=graph-rest-1.0) 
+  Chat: [ https://docs.microsoft.com/en-us/graph/api/chat-list?view=graph-rest-1.0 ](https://docs.microsoft.com/en-us/graph/api/chat-list?view=graph-rest-1.0) 
+  Calendar-Event: [ https://docs.microsoft.com/en-us/graph/api/group-list-events?view=graph-rest-1.0 ](https://docs.microsoft.com/en-us/graph/api/group-list-events?view=graph-rest-1.0) 

 **Partitioning queries** 

 You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to use concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can run concurrently. 
+  `PARTITION_FIELD`: the name of the field used to partition the query. 
+  `LOWER_BOUND`: an inclusive lower bound value of the chosen partition field. 

   For date, we accept the Spark date format used in Spark SQL queries. Example of valid values: `"2024-02-06"`. 
+  `UPPER_BOUND`: an exclusive upper bound value of the chosen partition field. 
+  `NUM_PARTITIONS`: number of partitions. 

 The following table describes which partitioning fields each entity supports: 


| Entity Name | Partitioning Fields | Data Type | 
| --- | --- | --- | 
| Team Members | visibleHistoryStartDateTime | DateTime | 
| Groups | createdDateTime | DateTime | 
| Channels | createdDateTime | DateTime | 
| Chats | createdDateTime, lastModifiedDateTime | DateTime | 
| Calendar Events | createdDateTime, lastModifiedDateTime, originalStart | DateTime | 

 **Example** 

```
microsoftteams_read = glueContext.create_dynamic_frame.from_options(
    connection_type="MicrosoftTeams",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "group",
        "API_VERSION": "v1.0",
        "PARTITION_FIELD": "createdDateTime",
        "LOWER_BOUND": "2022-07-13T07:55:27.065Z",
        "UPPER_BOUND": "2022-08-12T07:55:27.065Z",
        "NUM_PARTITIONS": "2"
    }
)
```

# Microsoft Teams connection option reference


The following are connection options for Microsoft Teams:
+  `ENTITY_NAME` (String) - (Required) Used for Read. The name of your object in Microsoft Teams. 
+  `API_VERSION` (String) - (Required) Used for Read. The Microsoft Teams REST API version you want to use. Example: v1.0. 
+  `SELECTED_FIELDS` (List<String>) - Default: empty (SELECT *). Used for Read. The columns you want to select for the object. 
+  `FILTER_PREDICATE` (String) - Default: empty. Used for Read. A filter in Spark SQL format. 
+  `QUERY` (String) - Default: empty. Used for Read. A full Spark SQL query. 
+  `PARTITION_FIELD` (String) - Used for Read. The field used to partition the query. 
+  `LOWER_BOUND` (String) - Used for Read. An inclusive lower bound value of the chosen partition field. 
+  `UPPER_BOUND` (String) - Used for Read. An exclusive upper bound value of the chosen partition field. 
+  `NUM_PARTITIONS` (Integer) - Default: 1. Used for Read. The number of partitions for the read. 
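Several of these options can be combined in one read. A sketch of an options dictionary using `SELECTED_FIELDS` and `FILTER_PREDICATE` (the connection name, field names, and predicate are placeholders, and the predicate must be valid Spark SQL):

```python
# Pass this dict as connection_options to
# glueContext.create_dynamic_frame.from_options(connection_type="MicrosoftTeams", ...).
connection_options = {
    "connectionName": "connectionName",          # placeholder connection name
    "ENTITY_NAME": "group",
    "API_VERSION": "v1.0",
    "SELECTED_FIELDS": ["id", "displayName"],    # columns to select
    "FILTER_PREDICATE": "displayName = 'Engineering'",  # Spark SQL format
}
```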

# Limitations


The following are limitations for the Microsoft Teams connector:
+  The Microsoft Teams API returns fewer records than specified for the Chat and Team Member entities. This issue has been reported to Microsoft Teams Support and is under investigation. 

## Create a new Microsoft Teams account:


1.  Navigate to Microsoft Teams’s homepage, [https://account.microsoft.com/account/](https://account.microsoft.com/account/) and choose **Sign in**. 

1.  Choose **Create one!**. 

1.  Enter the required information for account creation and create a new account. 

1.  Navigate to the Microsoft Teams website at [ https://www.microsoft.com/en-in/microsoft-teams/log-in](https://www.microsoft.com/en-in/microsoft-teams/log-in). 

1.  Sign up using the Microsoft Account you just created. 

1.  After successfully signing up for Teams, navigate to [https://account.microsoft.com/services](https://account.microsoft.com/services). 

1.  Choose **Try Microsoft 365**. 

1.  Activate one of the following Microsoft 365 or Microsoft Teams subscriptions to access all features required by the Microsoft Teams connector: 
   + Microsoft Teams Essentials
   + Microsoft 365 Business
   + Microsoft 365 Business Basic
   + Microsoft 365 Business Standard
   + Microsoft 365 Business Premium

**Create a managed client app:**

1.  To create a managed application, you need to register a new OAuth app on Microsoft Entra (formerly Azure Active Directory): 

1.  Sign in to the [Microsoft Entra admin center](https://entra.microsoft.com). 

1.  If you have access to multiple tenants, use the Settings icon in the top menu to switch to the tenant in which you want to register the application from the Directories + subscriptions menu. 

1.  Navigate to Identity > Applications > App registrations and select **New registration**. 

1. Enter a display Name for your application.

1.  Specify who can use the application in the Supported account types section. To make this app global, select “Accounts in any organizational directory” or “Accounts in any organizational directory and personal Microsoft accounts”. 

1.  Enter the Redirect URI `https://{region}.console.aws.amazon.com/appflow/oauth`. For example, for the `us-west-2` Region, add `https://us-west-2.console.aws.amazon.com/appflow/oauth`. You can add multiple URLs for the different Regions that you want to use.

1.  Register the app. 

1.  Note the Client ID for future use. 

1.  Choose **Add a certificate or secret** in the Essentials section. 

1.  Choose **New Client Secret**. 

1.  Enter Description and Expires duration. 

1.  Copy and save the client secret for future use. 

1.  In the left side menu list, select **API permissions**. 

1.  Choose **Add a permission**. 

1.  Select “Microsoft Graph“. 

1.  Select “Delegated permissions”. 

1.  Check all the following permissions: 
   + User.Read
   + offline_access
   + User.Read.All
   + User.ReadWrite.All
   + TeamsTab.ReadWriteForTeam
   + TeamsTab.ReadWriteForChat
   + TeamsTab.ReadWrite.All
   + TeamsTab.Read.All
   + TeamSettings.ReadWrite.All
   + TeamSettings.Read.All
   + TeamMember.ReadWrite.All
   + TeamMember.Read.All
   + Team.ReadBasic.All
   + GroupMember.ReadWrite.All
   + GroupMember.Read.All
   + Group.ReadWrite.All
   + Group.Read.All
   + Directory.ReadWrite.All
   + Directory.Read.All
   + Directory.AccessAsUser.All
   + Chat.ReadWrite
   + Chat.ReadBasic
   + Chat.Read
   + ChannelSettings.ReadWrite.All
   + ChannelSettings.Read.All
   + ChannelMessage.Read.All
   + Channel.ReadBasic.All

1.  Choose **Add permissions**. Your app is now set up. You can use the client ID and client secret to create a new connection. For more information, see [https://learn.microsoft.com/en-us/graph/auth-register-app-v2](https://learn.microsoft.com/en-us/graph/auth-register-app-v2). 

# Connecting to Mixpanel

Mixpanel is a real-time analytics platform that helps companies measure and optimize user engagement by tracking customer behavior. It enables you to track how users engage with your product and analyze this data with interactive reports that let you query and visualize the results with just a few clicks. As a Mixpanel user, you can connect AWS Glue to your Mixpanel account and then use Mixpanel as a data source in your ETL jobs. Run these jobs to transfer data between Mixpanel and AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for Mixpanel
](Mixpanel-support.md)
+ [

# Policies containing the API operations for creating and using connections
](mixpanel-configuring-iam-permissions.md)
+ [

# Configuring Mixpanel
](mixpanel-configuring.md)
+ [

# Configuring Mixpanel connections
](mixpanel-configuring-connections.md)
+ [

# Reading from Mixpanel entities
](mixpanel-reading-from-entities.md)
+ [

# Mixpanel connection options
](mixpanel-connection-options.md)
+ [

# Creating a Mixpanel account and configuring the client app
](mixpanel-create-account.md)
+ [

# Limitations
](mixpanel-connector-limitations.md)

# AWS Glue support for Mixpanel


AWS Glue supports Mixpanel as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Mixpanel.

**Supported as a target?**  
No.

**Supported Mixpanel API versions**  
 2.0 

# Policies containing the API operations for creating and using connections

 The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring Mixpanel


Before you can use AWS Glue to transfer data from Mixpanel, you must meet these requirements:

## Minimum requirements

+  You have a Mixpanel account. For more information about creating an account, see [Creating a Mixpanel account](mixpanel-create-account.md). 
+  Your Mixpanel account is enabled for API access. API access is enabled by default for the Enterprise, Unlimited, Developer, and Performance editions. 

If you meet these requirements, you’re ready to connect AWS Glue to your Mixpanel account. For typical connections, you don't need to do anything else in Mixpanel.

# Configuring Mixpanel connections


Mixpanel supports username and password for `BasicAuth`. Basic Authentication is a simple authentication method where clients provide credentials directly to access protected resources. AWS Glue uses the username and password to authenticate to the Mixpanel APIs. 

For public Mixpanel documentation about `BasicAuth` flow, see [ Mixpanel Service Accounts ](https://developer.mixpanel.com/reference/service-accounts). 

To configure a Mixpanel connection:

1. In AWS Secrets Manager, create a secret with the following details: 
   +  For Basic Authentication, the secret should contain your Mixpanel credentials with `USERNAME` and `PASSWORD` as keys. 
**Note**  
You must create a secret for each connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below: 

   1. When selecting a **Connection type**, select **Mixpanel**.

   1. Provide the `INSTANCE_URL` of the Mixpanel that you want to connect to.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions: 

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1.  Select the `secretName` that you want AWS Glue to use to store tokens for this connection. 

   1.  Select **Network options** if you want to use your network. 

1.  Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 
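The secret created in step 1 can be set up programmatically. A minimal sketch using the AWS SDK for Python (boto3); the secret name and placeholder values are hypothetical, while `USERNAME` and `PASSWORD` are the keys the connector expects for Basic Authentication:

```python
import json

# Hypothetical secret name and placeholder Mixpanel service account credentials.
SECRET_NAME = "glue/mixpanel/my-connection"
secret_payload = {
    "USERNAME": "your-service-account-username",
    "PASSWORD": "your-service-account-secret",
}

def create_connection_secret():
    """Store the payload in AWS Secrets Manager (requires AWS credentials)."""
    import boto3  # AWS SDK for Python
    return boto3.client("secretsmanager").create_secret(
        Name=SECRET_NAME,
        SecretString=json.dumps(secret_payload),
    )
```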

# Reading from Mixpanel entities


 **Prerequisites** 

You must have a Mixpanel object, such as `Funnels`, `Retention`, or `Retention Funnels`, from which you would like to read data. Additionally, you will need to know the object name.

 **Supported entities** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select * | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Funnels | Yes | No | No | Yes | No | 
| Retention | Yes | No | No | Yes | No | 
| Segmentation | Yes | No | No | Yes | No | 
| Segmentation Sum | Yes | No | No | Yes | No | 
| Segmentation Average | Yes | No | No | Yes | No | 
| Cohorts | Yes | No | No | Yes | No | 
| Engage | No | Yes | No | Yes | No | 
| Events | Yes | No | No | Yes | No | 
| Events Top | Yes | No | No | Yes | No | 
| Events Names | Yes | No | No | Yes | No | 
| Events Properties | Yes | No | No | Yes | No | 
| Events Properties Top | Yes | No | No | Yes | No | 
| Events Properties Values | Yes | No | No | Yes | No | 
| Annotations | Yes | No | No | Yes | No | 
| Profile Event Activity | Yes | No | No | Yes | No | 

 **Example** 

```
mixpanel_read = glueContext.create_dynamic_frame.from_options(
    connection_type="mixpanel",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "/cohorts/list?project_id=2603353",
        "API_VERSION": "2.0",
        "INSTANCE_URL": "https://www.mixpanel.com/api/app/me"
    }
)
```

 **Mixpanel entity and field details** 

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/mixpanel-reading-from-entities.html)

# Mixpanel connection options


The following are connection options for Mixpanel:
+  `ENTITY_NAME` (String) – (Required) Used for Read. The name of your object in Mixpanel. 
+  `API_VERSION` (String) – (Required) Used for Read. The Mixpanel REST API version you want to use. Example: 2.0. 
+  `SELECTED_FIELDS` (List<String>) – Default: empty (SELECT *). Used for Read. The columns you want to select for the object. 
+  `FILTER_PREDICATE` (String) – Default: empty. Used for Read. A filter in Spark SQL format. 
+  `QUERY` (String) – Default: empty. Used for Read. A full Spark SQL query. 
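A sketch of an options dictionary combining these options for a Mixpanel read (the connection name, project ID, and field names are placeholders; the entity path follows the style of the example above):

```python
# Pass this dict as connection_options to
# glueContext.create_dynamic_frame.from_options(connection_type="mixpanel", ...).
connection_options = {
    "connectionName": "connectionName",                 # placeholder connection name
    "ENTITY_NAME": "/cohorts/list?project_id=2603353",  # placeholder project ID
    "API_VERSION": "2.0",
    "SELECTED_FIELDS": ["id", "name"],                  # columns to select
}
```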

# Creating a Mixpanel account and configuring the client app


**Creating a Mixpanel account**

1. Navigate to the [Mixpanel home page](https://mixpanel.com/). 

1. On the **Mixpanel** home page, choose **Sign Up** at the upper-right corner of the page. 

1. On the **Let's get started** page, complete the following actions: 
   + Enter your email address in the designated field.
   + Select the required checkbox to agree to the terms.
   + Choose **Get Started** to proceed.

     Upon successful completion, you will receive a verification email. 

1. Check your email inbox for a verification message, open the email, and follow the instructions to verify your email address. 

1. On the verification page, choose **Verify Email** to complete your email verification. 

1. On the **Name Your Organization** page, enter your organization name and choose **Next**. 

1. On the **Your First Project** page, enter your project details and choose **Create**.

1. On the next page, choose **Let's get Started** to complete the creation of your account. 

**Logging into a Mixpanel account**

1. Navigate to the [Mixpanel login page](https://mixpanel.com/login/). 

1. Enter your email address and choose **Continue**. 

1. Check your email inbox for a verification message, open the email, and follow the instructions to verify your email address. 

1. On the next page, choose **Log In** to log in to your account. 

**Purchasing a Mixpanel plan**

1. On the Mixpanel page, select the **Settings** icon located in the upper-right corner of the page.

1. From the list of options, select **Plan Details and Billing**. 

1. On the **Plan Details and Billing** page, select **Upgrade or Modify**.

1. On the next page, select the plan that you want to purchase.

   This completes the account creation and plan purchasing process.

**Creating a username and client secret (To register your app)**

1. On the Mixpanel page, select the **Settings** icon located in the upper-right corner of the page. 

1. From the list of options, select **Project Settings**. 

1. On the **Project Settings** page, select **Service Accounts** and then select **Add Service Account**.

1. From the **Service Account** dropdown list, select an existing service account or enter a name to create one, add a **Project Role**, specify when it **expires**, and select **Add**. 
**Important**  
After completing the previous step, the following page displays the service account's secret key. Be sure to save the secret key. You will not be able to access it again after this point.

# Limitations


The following are limitations for the Mixpanel connector:
+ For the `Segmentation Numeric` entity, the Mixpanel API throws a `400 (Bad Request)` error if no numeric data is found for the mandatory filters. This is treated as an `OK` response to prevent flow failure.
+ The queryable field `limit` has been removed from the supported entities because:
  + It was causing errors because it was interpreted as the SDK's limit feature.
  + The filter served no practical purpose.
  + Equivalent functionality is now covered by the limit feature implementation.
+ Field-based partitioning cannot be supported because the SaaS platform lacks the required partitioning operators (`>=`, `<=`, `<`, `>`). Although the platform supports the `between` operator, the fields for which it supports that operator are non-retrievable, so the criteria for field-based partitioning are not satisfied.
+  Because there is no provision to get an 'offset' value for entities that support pagination, record-based partitioning is not supported for Mixpanel.
+ The `Cohorts` entity supports only a `CreatedDate/Time` field; because there is no field to identify `UpdatedDate/Time`, `DML_Status` cannot be determined. Also, there is no endpoint to identify deleted records. Hence, CDC cannot be supported.
+  To run an AWS Glue job for the entities mentioned below, mandatory filters are required. Refer to the table below for entity names and their required filters.  
**Entity name and required filters**    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/mixpanel-connector-limitations.html)

# Connecting to Monday


 Monday.com is a versatile work operating system that streamlines project management and team collaboration. It features customizable workflows, visual dashboards, and automation tools to enhance productivity. Users can track tasks, manage resources, and communicate effectively in one integrated platform. 

**Topics**
+ [

# AWS Glue support for Monday
](monday-support.md)
+ [

# Policies containing the API operations for creating and using connections
](monday-configuring-iam-permissions.md)
+ [

# Configuring Monday
](monday-configuring.md)
+ [

# Configuring Monday connections
](monday-configuring-connections.md)
+ [

# Reading from Monday entities
](monday-reading-from-entities.md)
+ [

# Monday connection option reference
](monday-connection-options.md)
+ [

# Limitations
](monday-connector-limitations.md)
+ [

## Create a new Monday account:
](#monday-account-creation)

# AWS Glue support for Monday


AWS Glue supports Monday as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Monday.

**Supported as a target?**  
No.

**Supported Monday API versions**  
 v2. 

# Policies containing the API operations for creating and using connections

 The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

You can also use the following managed IAM policies to allow access:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring Monday


Before you can use AWS Glue to transfer data from Monday, you must meet these requirements:

## Minimum requirements

+  You have a Monday developer account with an email address and password. For more information, see [Create a new Monday account:](connecting-to-monday.md#monday-account-creation). 
+  Your Monday developer account is enabled for API access. All Monday APIs are available at no additional cost during the trial period. After the trial period ends, you must buy a subscription to create and access data. For more information, see [Monday’s licensing page](https://developer.monday.com/api-reference/reference/about-the-api-reference). 

 If you meet these requirements, you’re ready to connect AWS Glue to your Monday account. For typical connections, you don't need to do anything else in Monday. 

# Configuring Monday connections


Monday supports the following two authentication mechanisms:

1.  OAuth: Monday supports the AUTHORIZATION_CODE grant type for OAuth2. 
   +  This grant type is considered “three-legged” OAuth because it relies on redirecting users to the third-party authorization server to authenticate. It is used when creating connections through the AWS Glue console. By default, the user creating a connection can rely on an AWS Glue-owned connected app and does not need to provide any OAuth-related information other than the Monday instance URL. The AWS Glue console redirects the user to Monday, where the user must log in and allow AWS Glue the requested permissions to access their Monday instance. 
   +  Users can instead create their own connected app in Monday and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they are still redirected to Monday to log in and authorize AWS Glue to access their resources. 
   +  This grant type results in a refresh token and an access token. The access token is active for one hour and can be refreshed automatically, without user interaction, using the refresh token. 
   +  For more information, see the [documentation on creating a connected app for the AUTHORIZATION_CODE OAuth flow](https://developers.Monday.com/docs/api/v1/Oauth). 

1.  Custom Auth: 
   +  For public Monday documentation on generating the required API keys for custom authorization, see [ https://developer.monday.com/api-reference/docs/authentication#api-token-permissions ](https://developer.monday.com/api-reference/docs/authentication#api-token-permissions). 

To configure a Monday connection:

1.  In AWS Secrets Manager, create a secret with the following details. You must create a secret for each connection in AWS Glue. 

   1.  For OAuth auth: 
      +  For a customer managed connected app, the secret should contain the connected app's consumer secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key. 

   1.  For Custom auth: 
      +  For a customer managed connected app, the secret should contain your API token with `personalAccessToken` as the key. 

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below: 

   1.  Under Data Connections, choose **Create connection**. 

   1. When selecting a **Data Source**, select Monday.

   1. Provide your Monday **instanceURL**.

   1.  Select the IAM role that AWS Glue can assume and that has permissions for the following actions: 

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1.  Select the authentication type to connect to Monday: 
      +  For OAuth auth: Provide the **Token URL** and **User Managed Client Application ClientId** of the Monday instance you want to connect to. 
      +  For Custom auth: Select Authentication Type **CUSTOM** to connect to Monday. 

   1.  Select the `secretName` that you want to use for this connection; AWS Glue stores the tokens in this secret. 

   1.  Select the network options if you want to use your network. 

1.  Grant the IAM role associated with your AWS Glue job permission to read `secretName`. Choose **Next**. 

1.  In your AWS Glue job configuration, provide `connectionName` as an **Additional network connection**. 
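The secret from step 1 can also be created programmatically. The following is a minimal boto3 sketch, assuming your AWS credentials and Region are already configured; the secret name you pass is whatever `secretName` you plan to select in the connection, and the key names come from the steps above:

```
import json

def oauth_secret_string(client_secret):
    # Key required for a customer managed connected app (OAuth auth).
    return json.dumps(
        {"USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET": client_secret})

def custom_auth_secret_string(api_token):
    # Key required for custom auth (personal API token).
    return json.dumps({"personalAccessToken": api_token})

def create_connection_secret(secrets_client, name, secret_string):
    # secrets_client is a boto3 Secrets Manager client, for example:
    #   secrets_client = boto3.client("secretsmanager")
    return secrets_client.create_secret(Name=name, SecretString=secret_string)
```

For example, `create_connection_secret(client, "monday-oauth-secret", oauth_secret_string("..."))` creates the OAuth secret; the secret name shown here is hypothetical.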

# Reading from Monday entities


 **Prerequisites** 
+  A Monday object you would like to read from. Refer to the supported entities table below to check the available entities. 

 **Supported entities for Source** 

 Entities list: 
+  Account: [https://developer.monday.com/api-reference/docs/account#queries](https://developer.monday.com/api-reference/docs/account#queries) 
+  Board: [https://developer.monday.com/api-reference/docs/boards#queries](https://developer.monday.com/api-reference/docs/boards#queries) 
+  Column: [https://developer.monday.com/api-reference/docs/columns#queries](https://developer.monday.com/api-reference/docs/columns#queries) 
+  Docs: [https://developer.monday.com/api-reference/docs/docs#queries](https://developer.monday.com/api-reference/docs/docs#queries) 
+  Document Block: [https://developer.monday.com/api-reference/docs/blocks#queries](https://developer.monday.com/api-reference/docs/blocks#queries) 
+  Files: [https://developer.monday.com/api-reference/docs/files#queries](https://developer.monday.com/api-reference/docs/files#queries) 
+  Folders: [https://developer.monday.com/api-reference/docs/folders#queries](https://developer.monday.com/api-reference/docs/folders#queries) 
+  Groups: [https://developer.monday.com/api-reference/docs/groups#queries](https://developer.monday.com/api-reference/docs/groups#queries) 
+  Item: [https://developer.monday.com/api-reference/docs/items#queries](https://developer.monday.com/api-reference/docs/items#queries) 
+  Subitems: [https://developer.monday.com/api-reference/docs/subitems#queries](https://developer.monday.com/api-reference/docs/subitems#queries) 
+  Tags: [https://developer.monday.com/api-reference/docs/tags-queries#queries](https://developer.monday.com/api-reference/docs/tags-queries#queries) 
+  Teams: [https://developer.monday.com/api-reference/docs/teams#queries](https://developer.monday.com/api-reference/docs/teams#queries) 
+  Updates: [https://developer.monday.com/api-reference/docs/updates#queries](https://developer.monday.com/api-reference/docs/updates#queries) 
+  Users: [https://developer.monday.com/api-reference/docs/users#queries](https://developer.monday.com/api-reference/docs/users#queries) 
+  Workspaces: [https://developer.monday.com/api-reference/docs/workspaces#queries](https://developer.monday.com/api-reference/docs/workspaces#queries) 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select * | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Account | No | No | No | Yes | No | 
| Boards | Yes | Yes | No | Yes | No | 
| Columns | No | No | No | Yes | No | 
| Docs | Yes | Yes | No | Yes | No | 
| Document Blocks | No | Yes | No | Yes | No | 
| Files | Yes | No | No | Yes | No | 
| Groups | No | No | No | Yes | No | 
| Item | Yes | Yes | No | Yes | No | 
| Subitems | No | No | No | Yes | No | 
| Tags | Yes | No | No | Yes | Yes | 
| Teams | Yes | No | No | Yes | No | 
| Updates | No | Yes | No | Yes | No | 
| Users | Yes | Yes | No | Yes | No | 
| Workspaces | Yes | Yes | No | Yes | No | 
| Folders | Yes | Yes | No | Yes | No | 

 **Example** 

```
monday_read = glueContext.create_dynamic_frame.from_options(
     connection_type="monday",
     connection_options={
         "connectionName": "connectionName",
         "ENTITY_NAME": "account",
         "API_VERSION": "v2"
     }
)
```

# Monday connection option reference


The following are connection options for Monday:
+  `ENTITY_NAME` (String) - (Required) Used for Read/Write. The name of your object in Monday. 
+  `API_VERSION` (String) - (Required) Used for Read/Write. The Monday REST API version you want to use. Example: v2. 
+  `SELECTED_FIELDS` (List<String>) - Default: empty (SELECT *). Used for Read. The columns you want to select for the object. 
+  `FILTER_PREDICATE` (String) - Default: empty. Used for Read. A filter predicate in Spark SQL format. 
+  `QUERY` (String) - Default: empty. Used for Read. A full Spark SQL query. 
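As a sketch of how these options combine, the map below uses `SELECTED_FIELDS` and `FILTER_PREDICATE` together; the entity and field names are hypothetical, and the commented-out call shows where the map is passed in a Glue job:

```
# Hypothetical connection options; "connectionName" must match your
# AWS Glue connection, and the entity/field names are illustrative only.
monday_options = {
    "connectionName": "connectionName",
    "ENTITY_NAME": "boards",
    "API_VERSION": "v2",
    "SELECTED_FIELDS": ["id", "name"],       # project only these columns
    "FILTER_PREDICATE": "name = 'Roadmap'",  # Spark SQL format
}

# In a Glue job:
# monday_read = glueContext.create_dynamic_frame.from_options(
#     connection_type="monday",
#     connection_options=monday_options,
# )
```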

# Limitations


The following are limitations for the Monday connector:
+  The dynamic metadata response conflicts with the documentation in the following ways: 
  +  The Group and Column entities support filter operations, but filtering is not present in the dynamic metadata endpoint, so they are treated as non-filterable entities. 
  +  The dynamic endpoint consists of around 15,000+ lines and returns the metadata of all the entities in a single response. Because of this significant response size, fields take an average of 10 seconds to load, which adds time when running a job. 
  +  Refer to the table below for the Monday rate limits.     
<a name="monday-rate-limit-table"></a>[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/monday-connector-limitations.html)

## Create a new Monday account


1.  Navigate to Monday’s homepage, [https://monday.com/](https://monday.com/) and choose **Login**. 

1.  You will be re-directed to the login page. On the bottom of the page, choose **Sign up**. 

1.  Enter your email address and choose **Continue**. Alternatively, you can sign in with Google. 

1.  Enter the required details and choose **Continue**. 

1.  Complete the survey questions and follow the steps to complete the account creation process. 

**Register an OAuth application:**

1.  Log into your monday.com account. Click on your avatar (picture icon) in the bottom left corner of your screen. 

1.  Choose **Developer**. 

1.  Choose **Create app**. 

1.  Complete the required fields for name and description. 

1. Navigate to the “OAuth” section on the right side, add the scopes, and choose “Save Feature”.

1.  Navigate to the “Redirect URLs” tab beside the scopes, add the redirect URL, and choose “Save Feature”. 

1.  Under the **Redirect URLs** tab, provide the URL of your app. This should be `https://<region-code>.console.aws.amazon.com/appflow/oauth`. For example, if you are using `us-east-1`, you can add `https://us-east-1.console.aws.amazon.com/appflow/oauth`. 

1.  The application is now ready to use. You can find your credentials in the “Basic Information” section. Note your Client ID and Client secret strings. These strings are used to make a connection with this app from AWS Glue. 

**Generate personal access token:**

 Currently, monday.com only offers V2 API tokens, which are all personal tokens. To access your API tokens, you can use one of two methods depending on your user level. Admin users can utilize both methods to acquire their API tokens. Member users can access their API tokens from their Developer tabs. 

 Admins - If you are an admin user on your monday.com account, you can access your API tokens from the "Admin" tab with the following steps: 

1.  Log into your monday.com account. Click on your avatar (picture icon) in the bottom left corner of your screen. 

1.  Select “Administration” from the resulting menu (this requires you to have admin permissions). 

1.  Navigate to the “API” section and generate an “API V2 Token”. You can copy your token and use it. 

 Developer - If you are a member user on your monday.com account, you can access your API tokens from the Developer tab with the following steps: 

1.  Log into your monday.com account. Click on your avatar (picture icon) in the bottom left corner of your screen. 

1.  Select “Developers” from the resulting menu. 

1.  In the top menu, choose the "Developer" drop-down menu. Select the first option on the drop-down menu titled "My Access Tokens." 

# Connecting to MongoDB in AWS Glue Studio

 AWS Glue provides built-in support for MongoDB. AWS Glue Studio provides a visual interface to connect to MongoDB, author data integration jobs, and run them on the AWS Glue Studio serverless Spark runtime. 

**Topics**
+ [

# Creating a MongoDB connection
](creating-mongodb-connection.md)
+ [

# Creating a MongoDB source node
](creating-mongodb-source-node.md)
+ [

# Creating a MongoDB target node
](creating-mongodb-target-node.md)
+ [

## Advanced options
](#creating-mongodb-connection-advanced-options)

# Creating a MongoDB connection


**Prerequisites**:
+ If your MongoDB instance is in an Amazon VPC, configure Amazon VPC to allow your AWS Glue job to communicate with the MongoDB instance without traffic traversing the public internet. 

  In Amazon VPC, identify or create a **VPC**, **Subnet** and **Security group** that AWS Glue will use while executing the job. Additionally, you need to ensure Amazon VPC is configured to permit network traffic between your MongoDB instance and this location. Based on your network layout, this may require changes to security group rules, Network ACLs, NAT Gateways and Peering connections.

**To configure a connection to MongoDB:**

1. Optionally, in AWS Secrets Manager, create a secret using your MongoDB credentials. To create a secret in Secrets Manager, follow the tutorial available in [Create an AWS Secrets Manager secret](https://docs.aws.amazon.com/secretsmanager/latest/userguide/create_secret.html) in the AWS Secrets Manager documentation. After creating the secret, keep the Secret name, *secretName*, for the next step. 
   + When selecting **Key/value pairs**, create a pair for the key `username` with the value *mongodbUser*.

     When selecting **Key/value pairs**, create a pair for the key `password` with the value *mongodbPass*.

1. In the AWS Glue console, create a connection by following the steps in [Adding an AWS Glue connection](console-connections.md). After creating the connection, keep the connection name, *connectionName*, for future use in AWS Glue. 
   + When selecting a **Connection type**, select **MongoDB** or **MongoDB Atlas**.
   + When selecting **MongoDB URL** or **MongoDB Atlas URL**, provide the hostname of your MongoDB instance.

     A MongoDB URL is provided in the format `mongodb://mongoHost:mongoPort/mongoDBname`.

     A MongoDB Atlas URL is provided in the format `mongodb+srv://mongoHost/mongoDBname`.
   + If you chose to create a Secrets Manager secret, choose the AWS Secrets Manager **Credential type**.

     Then, in **AWS Secret** provide *secretName*.
   + If you choose to provide **Username and password**, provide *mongodbUser* and *mongodbPass*.

1. In the following situations, you may require additional configuration:
   + 

     For MongoDB instances hosted on AWS in an Amazon VPC
     + You will need to provide Amazon VPC connection information to the AWS Glue connection that defines your MongoDB security credentials. When creating or updating your connection, set **VPC**, **Subnet** and **Security groups** in **Network options**.

After creating an AWS Glue MongoDB connection, you will need to perform the following steps before running your AWS Glue job:
+ When working with AWS Glue jobs in the visual editor, you must provide Amazon VPC connection information for your job to connect to MongoDB. Identify a suitable location in Amazon VPC and provide it to your AWS Glue MongoDB connection.
+ If you chose to create a Secrets Manager secret, grant the IAM role associated with your AWS Glue job permission to read *secretName*.

# Creating a MongoDB source node


## Prerequisites

+ An AWS Glue MongoDB connection, as described in the previous section, [Creating a MongoDB connection](creating-mongodb-connection.md).
+ If you chose to create a Secrets Manager secret, appropriate permissions on your job to read the secret used by the connection.
+ A MongoDB collection you would like to read from. You will need identification information for the collection.

  A MongoDB collection is identified by a database name and a collection name, *mongodbName*, *mongodbCollection*.

## Adding a MongoDB data source


**To add a Data source – MongoDB node:**

1.  Choose the connection for your MongoDB data source. Since you have created it, it should be available in the dropdown. If you need to create a connection, choose **Create MongoDB connection**. For more information, see the previous section, [Creating a MongoDB connection](creating-mongodb-connection.md). 

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1. Choose a **Database**. Enter *mongodbName*.

1. Choose a **Collection**. Enter *mongodbCollection*.

1. Choose your **Partitioner**, **Partition size (MB)** and **Partition key**. For more information about partition parameters, see ["connectionType": "mongodb" as source](aws-glue-programming-etl-connect-mongodb-home.md#etl-connect-mongodb-as-source).

1.  In **Custom MongoDB properties**, enter parameters and values as needed. 
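The steps above can also be expressed programmatically. The following is a minimal sketch, assuming a connection named `connectionName` and the option names shown here; verify them against the MongoDB connection option reference linked in the Advanced options section:

```
# Assumed option names; "connectionName", database, and collection
# correspond to the connection and the values entered in the steps above.
mongo_source_options = {
    "connectionName": "connectionName",
    "database": "mongodbName",
    "collection": "mongodbCollection",
}

# In a Glue job:
# mongo_read = glueContext.create_dynamic_frame.from_options(
#     connection_type="mongodb",
#     connection_options=mongo_source_options,
# )
```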

# Creating a MongoDB target node


## Prerequisites

+ An AWS Glue MongoDB connection, configured with an AWS Secrets Manager secret, as described in the previous section, [Creating a MongoDB connection](creating-mongodb-connection.md).
+ Appropriate permissions on your job to read the secret used by the connection.
+ A MongoDB collection you would like to write to, *mongodbCollection*.

## Adding a MongoDB data target


**To add a Data target – MongoDB node:**

1.  Choose the connection for your MongoDB data target. Since you have created it, it should be available in the dropdown. If you need to create a connection, choose **Create MongoDB connection**. For more information, see the previous section, [Creating a MongoDB connection](creating-mongodb-connection.md). 

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1. Choose a **Database**. Enter *mongodbName*.

1. Choose a **Collection**. Enter *mongodbCollection*.

1. Choose your **Partitioner**, **Partition size (MB)** and **Partition key**. For more information about partition parameters, see ["connectionType": "mongodb" as source](aws-glue-programming-etl-connect-mongodb-home.md#etl-connect-mongodb-as-source).

1. Choose **Retry Writes** if desired.

1.  In **Custom MongoDB properties**, enter parameters and values as needed. 
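As with the source, the target can be sketched programmatically. The option names below are assumptions chosen to illustrate the shape of the call; check them against the MongoDB connection option reference before use:

```
# Assumed option names; "retryWrites" corresponds to the Retry Writes
# choice in the steps above.
mongo_target_options = {
    "connectionName": "connectionName",
    "database": "mongodbName",
    "collection": "mongodbCollection",
    "retryWrites": "true",
}

# In a Glue job, writing a DynamicFrame `dyf`:
# glueContext.write_dynamic_frame.from_options(
#     frame=dyf,
#     connection_type="mongodb",
#     connection_options=mongo_target_options,
# )
```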

## Advanced options


You can provide advanced options when creating a MongoDB node. These options are the same as those available when programming AWS Glue for Spark scripts.

See [MongoDB connection option reference](aws-glue-programming-etl-connect-mongodb-home.md#aws-glue-programming-etl-connect-mongodb). 

# Connecting to Oracle NetSuite

Oracle NetSuite is an all-in-one cloud business management solution that helps organizations operate more effectively by automating core processes and providing real-time visibility into operational and financial performance. With a single, integrated suite of applications for managing accounting, order processing, inventory management, production, supply chain and warehouse operations, Oracle NetSuite gives companies clear visibility into their data and tighter control over their businesses.

**Topics**
+ [

# AWS Glue support for Oracle NetSuite
](oracle-netsuite-support.md)
+ [

# Policies containing the API operations for creating and using connections
](oracle-netsuite-configuring-iam-permissions.md)
+ [

# Configuring Oracle NetSuite
](oracle-netsuite-configuring.md)
+ [

# Configuring Oracle NetSuite connections
](oracle-netsuite-configuring-connections.md)
+ [

# Reading from Oracle NetSuite entities
](oracle-netsuite-reading-from-entities.md)
+ [

# Oracle NetSuite connection options
](oracle-netsuite-connection-options.md)
+ [

# Limitations and notes for Oracle NetSuite connector
](oracle-netsuite-connector-limitations.md)

# AWS Glue support for Oracle NetSuite


AWS Glue supports Oracle NetSuite as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Oracle NetSuite.

**Supported as a target?**  
No.

**Supported Oracle NetSuite API versions**  
The following Oracle NetSuite API versions are supported:
+ v1

For entity support per API version, see Supported entities for source.

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

If you don't want to use the above method, alternatively use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Oracle NetSuite


Before you can use AWS Glue to transfer data from Oracle NetSuite, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have an Oracle NetSuite account. For more information, see [Creating an Oracle NetSuite account](#oracle-netsuite-configuring-creating-oracle-netsuite-account).
+ Your Oracle NetSuite account is enabled for API access.
+ You have created an OAuth 2.0 API integration in your Oracle NetSuite developer account. This integration provides the client credentials that AWS Glue uses to access your data securely when it makes authenticated calls to your account. For more information, see [Creating an Oracle NetSuite client app and OAuth 2.0 credentials](#oracle-netsuite-configuring-creating-oracle-netsuite-client-app).

If you meet these requirements, you’re ready to connect AWS Glue to your Oracle NetSuite account.

## Creating an Oracle NetSuite account


Navigate to [Oracle NetSuite](https://www.netsuite.com/portal/home.shtml), and choose **Free Product Tour**. Fill in the required details to get a free product tour, through which you can contact a vendor. The process for procuring an account is as follows:
+ The procurement of a NetSuite account is done via a vendor, who provides a form/quote that has to be legally reviewed.
+ The account to procure for the Oracle NetSuite connector is of the **Standard Cloud Service** type.
+ This account is created by the vendor, and temporary credentials are shared by them. You will receive a welcome email from NetSuite <billing@notification.netsuite.com> <system@sent-via.netsuite.com> with details such as your username and a link to set your password.
+ Use the **Set your password** link to set the password for the username provided by the vendor.

## Creating an Oracle NetSuite client app and OAuth 2.0 credentials


To get the Client ID and Client Secret, create an Oracle NetSuite client app:

1. Log into your NetSuite account through the [NetSuite customer login](https://system.netsuite.com/pages/customerlogin.jsp).

1. Choose **Setup** > **Company** > **Enable features**.

1. Navigate to the **SuiteCloud** section and select the **REST WEB SERVICES** checkbox under **SuiteTalk (Web Services)**.

1. Select the **OAUTH 2.0** checkbox under **Manage Authentication**. Click **Save**.

1. Go to **Setup** > **Integration** > **Manage Integrations** and choose **New** to create an OAuth2.0 application.

1. Enter a name of your choice and keep the **STATE** as Enabled.

1. If checked, uncheck the **TBA: AUTHORIZATION FLOW** and **TOKEN-BASED AUTHENTICATION** checkboxes displayed under **Token-based Authentication**.

1. Select the **AUTHORIZATION CODE GRANT** and **PUBLIC CLIENT** checkboxes under **OAuth 2.0**.

1. Under Auth, note the Client ID and Client Secret.

1. Enter a **REDIRECT URI**. For example, `https://us-east-1.console.aws.amazon.com/gluestudio/oauth`

1. Select the **REST WEB SERVICES** checkbox under **SCOPE**.

1. Select the **USER CREDENTIALS** checkbox under **User Credentials**. Choose **Save**.

1. Note the CONSUMER KEY/CLIENT ID and CONSUMER SECRET/CLIENT SECRET under **Client Credentials**. These values are displayed only once.

1. Create an ADMINISTRATOR role if needed by navigating to **User/Roles** > **Manage Roles** > **New**.

1. While creating a custom role, add full access under the **Permissions** tab for the following entities/functionalities:
   + "Deposit", "Items", "Item Fulfillment", "Make Journal Entry", "Purchase Order", "Subsidiaries", "Vendors", "Bills", "Vendor Return Authorization", "Track Time", "Customer Payment", "Custom Record Entries", "Custom Record Types", "REST Web Services", "OAuth 2.0 Authorized Applications Management", "Custom Entity Fields", "Log in using OAuth 2.0 Access Tokens".

For more information see [OAuth 2.0](https://docs.oracle.com/en/cloud/saas/netsuite/ns-online-help/chapter_157769826287.html) in the NetSuite Applications Suite documentation.

# Configuring Oracle NetSuite connections


Oracle NetSuite supports the AUTHORIZATION_CODE grant type for OAuth2. The grant type determines how AWS Glue communicates with Oracle NetSuite to request access to your data.
+ This grant type is considered "three-legged" OAuth as it relies on redirecting users to a third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue console. The user creating a connection may by default rely on an AWS Glue-owned connected app (AWS Glue managed client application) where they do not need to provide any OAuth-related information except for their Oracle NetSuite instance URL. The AWS Glue console will redirect the user to Oracle NetSuite where the user must log in and allow AWS Glue the requested permissions to access their Oracle NetSuite instance.
+ Users may still opt to create their own connected app in Oracle NetSuite and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to Oracle NetSuite to log in and authorize AWS Glue to access their resources.
+ This grant type results in a refresh token and access token. The access token is short lived, and may be refreshed automatically without user interaction using the refresh token.
+ For public Oracle NetSuite documentation on creating a connected app for Authorization Code OAuth flow, see [Public apps](https://developers.oracle-netsuite.com/docs/api/creating-an-app).

To configure an Oracle NetSuite connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For the customer managed connected app, the secret should contain the connected app's Consumer Secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key.

   1. Note: You must create a secret for your connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select Oracle NetSuite.

   1. Provide the Oracle NetSuite environment.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version":"2012-10-17",		 	 	 
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want to use for this connection; AWS Glue stores the tokens in this secret.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

# Reading from Oracle NetSuite entities


**Prerequisite**

An Oracle NetSuite object you would like to read from. You will need the object name, such as `deposit` or `timebill`. The following table shows the supported entities.

**Supported entities for source**:


| Entity | Can be filtered | Supports Order By | Supports Limit | Supports SELECT * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Deposit | Yes | No | Yes | Yes | Yes | 
| Description Item | Yes | No | Yes | Yes | Yes | 
| Inventory Item | Yes | No | Yes | Yes | Yes | 
| Item Fulfillment | Yes | No | Yes | Yes | Yes | 
| Item Group | Yes | No | Yes | Yes | Yes | 
| Journal Entry | Yes | No | Yes | Yes | Yes | 
| Non-Inventory Purchase Item | Yes | No | Yes | Yes | Yes | 
| Non-Inventory Resale Item | Yes | No | Yes | Yes | Yes | 
| Non-Inventory Sale Item | Yes | No | Yes | Yes | Yes | 
| Purchase Order | Yes | No | Yes | Yes | Yes | 
| Subsidiary | Yes | No | Yes | Yes | Yes | 
| Vendor | Yes | No | Yes | Yes | Yes | 
| Vendor Bill | Yes | No | Yes | Yes | Yes | 
| Vendor Return Authorization | Yes | No | Yes | Yes | Yes | 
| Time Bill | Yes | No | Yes | Yes | Yes | 
| Customer Payment | Yes | No | Yes | Yes | Yes | 
| Fulfillment Request | Yes | No | Yes | Yes | Yes | 
| Item | Yes | Yes | Yes | Yes | Yes | 
| Transaction Line | Yes | Yes | Yes | Yes | Yes | 
| Transaction Accounting Line | Yes | Yes | Yes | Yes | Yes | 
| Custom Record Types (Dynamic) | Yes | Yes | Yes | Yes | Yes | 

**Example**:

```
netsuiteerp_read = glueContext.create_dynamic_frame.from_options(
    connection_type="netsuiteerp",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "deposit",
        "API_VERSION": "v1"
    }
)
```

**Oracle NetSuite entity and field details**:

Oracle NetSuite dynamically loads the available fields under the selected entity. Depending on the data type of the field, the following filter operators are supported.


| Field data type | Supported filter operators | 
| --- | --- | 
| String | LIKE, =, != | 
| Date | BETWEEN, =, <, <=, >, >= | 
| DateTime | BETWEEN, <, <=, >, >= | 
| Numeric |  =, !=, <, <=, >, >= | 
| Boolean |  =, != | 

**Expected input format for Boolean values in Filter Expression**:


| Entity | Boolean "true" value format | Boolean "false" value format | Example | 
| --- | --- | --- | --- | 
| Item, Transaction Line, Transaction Accounting Line, and Custom Record Type entities | T or t | F or f | isinactive = "T" or isinactive = "t" | 
| All other entities | true | false | isinactive = true | 
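A small helper sketch showing how the two boolean formats from the table above translate into `FILTER_PREDICATE` strings; the field name `isinactive` is taken from the table's examples:

```
# Entities that expect the T/F boolean format, per the table above.
TF_ENTITIES = {
    "Item", "Transaction Line", "Transaction Accounting Line",
    "Custom Record Type",
}

def inactive_filter(entity):
    # Build an isinactive predicate in the format the entity expects.
    if entity in TF_ENTITIES:
        return 'isinactive = "T"'
    return "isinactive = true"
```

For example, `inactive_filter("Item")` yields `isinactive = "T"`, while `inactive_filter("Deposit")` yields `isinactive = true`.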

## Partitioning queries


**Field-based partitioning**

The Oracle NetSuite connector has dynamic metadata, so the supported fields for field-based partitioning are chosen dynamically. Field-based partitioning is supported on fields with the data type Integer, BigInteger, Date, or DateTime.

You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query would be split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For the timestamp field, we accept the Spark timestamp format used in Spark SQL queries.

  Examples of valid values:

  ```
  "TIMESTAMP \"1707256978123\""
  "TIMESTAMP \"1702600882\""
  "TIMESTAMP '2024-02-06T22:00:00.000Z'"
  "TIMESTAMP '2024-02-06T22:00:00Z'"
  "TIMESTAMP '2024-02-06'"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.

Example:

```
netsuiteerp_read = glueContext.create_dynamic_frame.from_options(
    connection_type="netsuiteerp",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "deposit",
        "API_VERSION": "v1",
        "PARTITION_FIELD": "id",
        "LOWER_BOUND": "1",
        "UPPER_BOUND": "10000",
        "NUM_PARTITIONS": "10"
    }
)
```

**Record-based partitioning**

You can provide the additional Spark option `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With this parameter, the original query would be split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.

In record-based partitioning, the total number of records is queried from the Oracle NetSuite API and divided by the `NUM_PARTITIONS` value provided; each sub-query then concurrently fetches the resulting number of records.
+ `NUM_PARTITIONS`: the number of partitions.

Example:

```
netsuiteerp_read = glueContext.create_dynamic_frame.from_options(
    connection_type="netsuiteerp",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "deposit",
        "API_VERSION": "v1",
        "NUM_PARTITIONS": "3"
    }
)
```

# Oracle NetSuite connection options


The following are connection options for Oracle NetSuite:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of the Oracle NetSuite entity. Example: deposit.
+ `API_VERSION`(String) - (Required) Used for Read. The Oracle NetSuite REST API version you want to use. The value is v1, as Oracle NetSuite currently supports only version v1.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. Comma-separated list of columns you want to select for the selected entity.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition the query (field-based partitioning).
+ `LOWER_BOUND`(String) - Used for Read. An inclusive lower bound value of the chosen partition field, used in field-based partitioning.
+ `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field, used in field-based partitioning.
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for the read. Used in both field- and record-based partitioning.
+ `INSTANCE_URL`(String) - A valid NetSuite instance URL in the format https://*account-id*.suitetalk.api.netsuite.com.
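
The options above can be combined in a single `create_dynamic_frame.from_options` call. The following is a minimal sketch of a helper that assembles these options; the helper itself, its defaults, and the column names are hypothetical, but the option keys follow the reference above.

```python
# Hypothetical helper that assembles Oracle NetSuite connection options
# for glueContext.create_dynamic_frame.from_options. Option keys follow
# the reference above; the helper and its defaults are illustrative.
def netsuite_options(connection_name, entity, num_partitions=1,
                     selected_fields=None, filter_predicate=None):
    opts = {
        "connectionName": connection_name,
        "ENTITY_NAME": entity,
        "API_VERSION": "v1",  # NetSuite currently supports only v1
        "NUM_PARTITIONS": str(num_partitions),
    }
    if selected_fields:
        # SELECTED_FIELDS is a comma-separated list of columns
        opts["SELECTED_FIELDS"] = ",".join(selected_fields)
    if filter_predicate:
        # FILTER_PREDICATE uses Spark SQL syntax
        opts["FILTER_PREDICATE"] = filter_predicate
    return opts

# Usage sketch (column names and predicate are illustrative):
# netsuiteerp_read = glueContext.create_dynamic_frame.from_options(
#     connection_type="netsuiteerp",
#     connection_options=netsuite_options(
#         "connectionName", "deposit", num_partitions=3,
#         selected_fields=["id", "total"],
#         filter_predicate="total > 100"))
```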

# Limitations and notes for Oracle NetSuite connector


The following are limitations or notes for the Oracle NetSuite connector:
+ The values of the `access_token` and `refresh_token` parameters are in JSON Web Token (JWT) format. The access token is valid for 60 minutes, whereas the refresh token is valid for seven days.
+ During client ID and client secret generation, if you select "PUBLIC CLIENT" along with "AUTHORIZATION CODE GRANT", then the refresh token is only valid for three hours and is for one-time use only.
+ You can fetch at most 100,000 records using the connector. For more information, refer to [Executing SuiteQL Queries Through REST Web Services](https://docs.oracle.com/en/cloud/saas/netsuite/ns-online-help/section_157909186990.html).
+ Partitions are created such that each partition fetches records in multiples of 1,000, except possibly the last one, which fetches the remaining records.
+ For the Item, Transaction Line, and Transaction Accounting Line objects, the connector does not support certain operators for the following reasons:
  + Applying the `EQUAL_TO` or `NOT_EQUAL_TO` filter operators to fields of type Date gives unreliable results.
  + Applying the `LESS_THAN_OR_EQUAL_TO` filter operator to fields of type Date gives unreliable results and behaves like the `LESS_THAN` operator.
  + Applying the `GREATER_THAN` filter operator to fields of type Date gives unreliable results and behaves like the `GREATER_THAN_OR_EQUAL_TO` operator.
+ For the Item, Transaction Line, Transaction Accounting Line, and Custom Record Type objects, boolean values come in the format T/F instead of the standard true/false. The connector maps the T/F values to true/false to ensure consistency in the data.

# Connecting to OpenSearch Service in AWS Glue Studio

AWS Glue provides built-in support for Amazon OpenSearch Service. AWS Glue Studio provides a visual interface to connect to Amazon OpenSearch Service, author data integration jobs, and run them on the AWS Glue Studio serverless Spark runtime. This feature is not compatible with Amazon OpenSearch Serverless.

 AWS Glue Studio creates a unified connection for Amazon OpenSearch Service. For more information, see [Considerations](using-connectors-unified-connections.md#using-connectors-unified-connections-considerations). 

**Topics**
+ [

# Creating an OpenSearch Service connection
](creating-opensearch-connection.md)
+ [

# Creating an OpenSearch Service source node
](creating-opensearch-source-node.md)
+ [

# Creating an OpenSearch Service target node
](creating-opensearch-target-node.md)
+ [

## Advanced options
](#creating-opensearch-connection-advanced-options)

# Creating an OpenSearch Service connection


**Prerequisites**:
+ Identify the domain endpoint, *aosEndpoint*, and port, *aosPort*, that you would like to read from, or create the resource by following the instructions in the Amazon OpenSearch Service documentation. For more information on creating a domain, see [Creating and managing Amazon OpenSearch Service domains](https://docs.aws.amazon.com//opensearch-service/latest/developerguide/createupdatedomains.html) in the Amazon OpenSearch Service documentation.

  An Amazon OpenSearch Service domain endpoint will have the following default form, https://search-*domainName*-*unstructuredIdContent*.*region*.es.amazonaws.com. For more information on identifying your domain endpoint, see [Creating and managing Amazon OpenSearch Service domains](https://docs.aws.amazon.com//opensearch-service/latest/developerguide/createupdatedomains.html) in the Amazon OpenSearch Service documentation. 

  Identify or generate HTTP basic authentication credentials, *aosUser* and *aosPassword*, for your domain.

**To configure a connection to OpenSearch Service:**

1. In AWS Secrets Manager, create a secret using your OpenSearch Service credentials. To create a secret in Secrets Manager, follow the tutorial in [Create an AWS Secrets Manager secret](https://docs.aws.amazon.com//secretsmanager/latest/userguide/create_secret.html) in the AWS Secrets Manager documentation. After creating the secret, keep the secret name, *secretName*, for the next step.
   + When selecting **Key/value pairs**, create a pair for the key `USERNAME` with the value *aosUser*.
   + When selecting **Key/value pairs**, create a pair for the key `PASSWORD` with the value *aosPassword*.

1. In the AWS Glue console, create a connection by following the steps in [Adding an AWS Glue connection](console-connections.md). After creating the connection, keep the connection name, *connectionName*, for future use in AWS Glue. 
   + When selecting a **Connection type**, select OpenSearch Service.
   + When selecting a **Domain endpoint**, provide *aosEndpoint*.
   + When selecting a **Port**, provide *aosPort*.
   + When selecting an **AWS Secret**, provide *secretName*.
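
The secret-creation step can also be scripted. The following sketch uses boto3 to create the Secrets Manager secret with the required `USERNAME` and `PASSWORD` keys; the secret name, credentials, and helper function names are placeholders.

```python
import json

def opensearch_secret_string(aos_user, aos_password):
    # The connection expects exactly these two keys in the secret.
    return json.dumps({"USERNAME": aos_user, "PASSWORD": aos_password})

def create_opensearch_secret(secret_name, aos_user, aos_password):
    # Requires AWS credentials with secretsmanager:CreateSecret.
    import boto3
    client = boto3.client("secretsmanager")
    return client.create_secret(
        Name=secret_name,
        SecretString=opensearch_secret_string(aos_user, aos_password),
    )
```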

# Creating an OpenSearch Service source node


## Prerequisites

+ An AWS Glue OpenSearch Service connection, configured with an AWS Secrets Manager secret, as described in the previous section, [Creating an OpenSearch Service connection](creating-opensearch-connection.md).
+ Appropriate permissions on your job to read the secret used by the connection.
+ An OpenSearch Service index you would like to read from, *aosIndex*.

## Adding an OpenSearch Service data source


**To add a Data source – OpenSearch Service node:**

1. Choose the connection for your OpenSearch Service data source. If you have already created it, it should be available in the dropdown. If you need to create a connection, choose **Create OpenSearch Service connection**. For more information, see the previous section, [Creating an OpenSearch Service connection](creating-opensearch-connection.md).

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1. Provide **Index**, the index you would like to read from.

1. Optionally, provide **Query**, an OpenSearch query to deliver more specific results. For more information about writing OpenSearch queries, consult [Reading from OpenSearch Service indexes](aws-glue-programming-etl-connect-opensearch-home.md#aws-glue-programming-etl-connect-opensearch-read).

1.  In **Custom OpenSearch Service properties**, enter parameters and values as needed. 

# Creating an OpenSearch Service target node


## Prerequisites

+ An AWS Glue OpenSearch Service connection, configured with an AWS Secrets Manager secret, as described in the previous section, [Creating an OpenSearch Service connection](creating-opensearch-connection.md).
+ Appropriate permissions on your job to read the secret used by the connection.
+ An OpenSearch Service index you would like to write to, *aosIndex*.

## Adding an OpenSearch Service data target


**To add a Data target – OpenSearch Service node:**

1. Choose the connection for your OpenSearch Service data target. If you have already created it, it should be available in the dropdown. If you need to create a connection, choose **Create OpenSearch Service connection**. For more information, see the previous section, [Creating an OpenSearch Service connection](creating-opensearch-connection.md).

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1. Provide **Index**, the index you would like to write to.

1.  In **Custom OpenSearch Service properties**, enter parameters and values as needed. 

## Advanced options


You can provide advanced options when creating an OpenSearch Service node. These options are the same as those available when programming AWS Glue for Spark scripts.

See [OpenSearch Service connections](aws-glue-programming-etl-connect-opensearch-home.md). 

# Connecting to Okta


The Okta API is the programmatic interface to Okta, used for managing large or complex Okta accounts. If you're an Okta user, you can connect AWS Glue to your Okta account. Then, you can use Okta as a data source in your ETL jobs. Run these jobs to transfer data between Okta and AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for Okta
](okta-support.md)
+ [

# Policies containing the API operations for creating and using connections
](okta-configuring-iam-permissions.md)
+ [

# Configuring Okta
](okta-configuring.md)
+ [

# Configuring Okta connections
](okta-configuring-connections.md)
+ [

# Reading from Okta entities
](okta-reading-from-entities.md)
+ [

# Okta connection option reference
](okta-connection-options.md)
+ [

# Okta New Account and Developer App creation steps
](okta-create-account.md)
+ [

# Limitations
](okta-connector-limitations.md)

# AWS Glue support for Okta


AWS Glue supports Okta as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Okta.

**Supported as a target?**  
No.

**Supported Okta API versions**  
 v1. 

# Policies containing the API operations for creating and using connections

 The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

You can also use the following managed IAM policies to allow access:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring Okta


Before you can use AWS Glue to transfer data to or from Okta, you must meet these requirements:

## Minimum requirements

+ You have an Okta account. For more information on creating an account, see [Okta New Account and Developer App creation steps](okta-create-account.md).
+ Your Okta account is enabled for API access.
+ You have created an OAuth2 API integration in your Okta account. This integration provides the client credentials that AWS Glue uses to access your data securely when it makes authenticated calls to your account. For the steps to create a client app and OAuth 2.0 credentials, see [Okta New Account and Developer App creation steps](okta-create-account.md).
+ You have an Okta account with an Okta API token. Refer to the [Okta documentation](https://developer.okta.com/docs/guides/create-an-api-token/main/#create-the-token).

If you meet these requirements, you're ready to connect AWS Glue to your Okta account. For typical connections, you don't need to do anything else in Okta.

# Configuring Okta connections


 Okta supports two types of authentication mechanisms: 
+  OAuth auth: Okta supports the `AUTHORIZATION_CODE` grant type. 
  + This grant type is considered "three-legged" OAuth, as it relies on redirecting users to the third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue console. The AWS Glue console will redirect the user to Okta, where the user must log in and grant AWS Glue the requested permissions to access their Okta instance.
  + Users may opt to create their own connected app in Okta and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to Okta to log in and authorize AWS Glue to access their resources.
  + This grant type results in a refresh token and an access token. The access token is short-lived and may be refreshed automatically, without user interaction, using the refresh token.
  +  For more information, see [ public Okta documentation on creating a connected app for Authorization Code OAuth flow ](https://developers.google.com/workspace/guides/create-credentials). 
+  Custom auth: 
  +  For public Okta documentation on generating the required API keys for custom authorization, see [ Okta documentation ](https://developer.okta.com/docs/guides/create-an-api-token/main/#create-the-token). 

To configure an Okta connection:

1. In AWS Secrets Manager, create a secret with the following details. You must create a secret for each connection in AWS Glue.

   1.  For OAuth auth: 
      +  For customer managed connected app – Secret should contain the connected app Consumer Secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as key. 

   1.  For Custom auth: 
      +  For customer managed connected app - Secret should contain the connected app Consumer Secret with `OktaApiToken` as key. 

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below: 

   1.  Under Connections, choose **Create connection**. 

   1. When selecting a **Data Source**, select Okta.

   1. Provide your Okta subdomain.

   1. Select the Okta Domain URL of your Okta account.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the authentication type to connect to the data source.

   1.  For OAuth2 authentication type, provide the **User Managed Client Application ClientId** of the Okta app. 

   1. Select the secret, `secretName`, that you want to use for this connection; AWS Glue stores the OAuth tokens in this secret.

   1.  Select the network options if you want to use your network. 

1.  Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 

1.  In your AWS Glue job configuration, provide `connectionName` as an **Additional network connection**. 
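
Step 1 above can also be scripted. This sketch builds the secret payload for either authentication type; the key names come from the procedure above, while the function name and values are illustrative.

```python
import json

def okta_secret_string(auth_type, value):
    # OAuth connections store the client secret under
    # USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET; custom-auth
    # connections store the API token under OktaApiToken.
    if auth_type == "oauth2":
        key = "USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET"
    elif auth_type == "custom":
        key = "OktaApiToken"
    else:
        raise ValueError(f"unknown auth type: {auth_type}")
    return json.dumps({key: value})
```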

# Reading from Okta entities


 **Prerequisites** 
+ An Okta object you would like to read from. Refer to the supported entities table below to check the available entities.

 **Supported entities** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select * | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Applications | Yes | Yes | No | Yes | No | 
| Devices | Yes | Yes | No | Yes | Yes | 
| Groups | Yes | Yes | Yes | Yes | Yes | 
| Users | Yes | Yes | Yes | Yes | Yes | 
| User Types | No | No | No | Yes | No | 

 **Example** 

```
okta_read = glueContext.create_dynamic_frame.from_options(
    connection_type="okta",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "applications",
        "API_VERSION": "v1"
    }
)
```

 **Okta entity and field details** 

 Entities list: 
+  Application: [ https://developer.okta.com/docs/api/openapi/okta-management/management/tag/Application/ ](https://developer.okta.com/docs/api/openapi/okta-management/management/tag/Application/) 
+  Device: [ https://developer.okta.com/docs/api/openapi/okta-management/management/tag/Device/ ](https://developer.okta.com/docs/api/openapi/okta-management/management/tag/Device/) 
+  Group: [ https://developer.okta.com/docs/api/openapi/okta-management/management/tag/Group/ ](https://developer.okta.com/docs/api/openapi/okta-management/management/tag/Group/) 
+  User: [ https://developer.okta.com/docs/api/openapi/okta-management/management/tag/User/ ](https://developer.okta.com/docs/api/openapi/okta-management/management/tag/User/) 
+  User Type: [ https://developer.okta.com/docs/api/openapi/okta-management/management/tag/UserType/ ](https://developer.okta.com/docs/api/openapi/okta-management/management/tag/UserType/) 

 **Partitioning queries** 

The additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` can be provided if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+  `LOWER_BOUND`: an inclusive lower bound value of the chosen partition field. 

   For date, we accept the Spark date format used in Spark SQL queries. Example of valid values: `"2024-02-06"`. 
+  `UPPER_BOUND`: an exclusive upper bound value of the chosen partition field. 
+  `NUM_PARTITIONS`: number of partitions. 

 **Example** 

```
okta_read = glueContext.create_dynamic_frame.from_options(
    connection_type="okta",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "groups",
        "API_VERSION": "v1",
        "PARTITION_FIELD": "lastMembershipUpdated",
        "LOWER_BOUND": "2022-08-10T10:28:46.000Z",
        "UPPER_BOUND": "2024-08-10T10:28:46.000Z",
        "NUM_PARTITIONS": "10"
    }
)
```

# Okta connection option reference


The following are connection options for Okta:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in Okta.
+ `API_VERSION`(String) - (Required) Used for Read. The Okta REST API version you want to use. Example: v1.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition the query.
+ `LOWER_BOUND`(String) - Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for the read.
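
As a hedged illustration of combining these options, the following sketch reads the Users entity with a column subset and a Spark SQL filter. The column names and filter value are illustrative, not taken from the Okta schema.

```python
# Connection options for reading the Okta "users" entity; the selected
# columns and the filter predicate are illustrative placeholders.
okta_options = {
    "connectionName": "connectionName",
    "ENTITY_NAME": "users",
    "API_VERSION": "v1",
    "SELECTED_FIELDS": "id,status",           # subset of columns
    "FILTER_PREDICATE": "status = 'ACTIVE'",  # Spark SQL syntax
}

# okta_read = glueContext.create_dynamic_frame.from_options(
#     connection_type="okta", connection_options=okta_options)
```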

# Okta New Account and Developer App creation steps


Create a developer account on Okta to get access to the Okta API. A free Okta developer account gives access to most key developer features required for accessing the Okta API.

**To create a developer account on Okta**

1. Navigate to [https://developer.okta.com/signup/](https://developer.okta.com/signup/).

1. Enter your account information: email, first name, last name, and country/region. Choose **I'm not a robot** and then choose **Sign up**.

1. A verification email is sent to your registered email address. You will receive a link in your email for activating your Okta developer account. Choose **Activate**.

1. You will be redirected to the password reset page. Enter the new password twice and choose **Reset password**.

1.  You will be redirected to your Okta developer account dashboard. 

**To create a client app and OAuth 2.0 credentials**

1. In the developer dashboard, choose **Create App Integration**.   
![\[The app integration creation step in the Okta developer dashboard.\]](http://docs.aws.amazon.com/glue/latest/dg/images/create-client-app-step-1.png)

1. The **Create a new app integration** window appears and presents various sign-in methods. Select **OIDC – OpenID Connect**.

1. Scroll down to the Application type section. Select **Web Application** and choose **Next**.

1. On the **New Web App Integration** screen, fill in the following information:
   + App integration name - Enter the name of the app.
   + Grant type - Choose **Authorization Code** and **Refresh Token** from the list.
   + Sign-in redirect URIs - Choose **Add URI** and add `https://{regioncode}.console.aws.amazon.com/appflow/oauth`. For example, if you are using `us-west-2 (Oregon)`, you can add `https://us-west-2.console.aws.amazon.com/appflow/oauth`.
   + Controlled Access - Assign the app to your user groups as you require and choose **Save**.

1. Your client ID and client secret are generated.

# Limitations


The following are limitations for the Okta connector:
+ For the Applications entity, only one filter can be applied. If more than one filter is applied, a 400 Bad Request is returned with the error summary "Invalid Search criteria".
+ Order by is supported only with search queries. For example, `http://dev-15940405.okta.com/api/v1/groups?search=type eq "OKTA_GROUP"&sortBy=lastUpdated&sortOrder=asc`

# Connecting to PayPal

PayPal is a payments system that facilitates online money transfers between parties, such as transfers between customers and online vendors. If you're a PayPal user, your account contains data about your transactions, such as their payers, dates, and statuses. You can use AWS Glue to transfer data from PayPal to certain AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for PayPal
](paypal-support.md)
+ [

# Policies containing the API operations for creating and using connections
](paypal-configuring-iam-permissions.md)
+ [

# Configuring PayPal
](paypal-configuring.md)
+ [

# Configuring PayPal connections
](paypal-configuring-connections.md)
+ [

# Reading from PayPal entities
](paypal-reading-from-entities.md)
+ [

# PayPal connection options
](paypal-connection-options.md)
+ [

# Limitations and notes for PayPal connector
](paypal-connector-limitations.md)

# AWS Glue support for PayPal


AWS Glue supports PayPal as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from PayPal.

**Supported as a target?**  
No.

**Supported PayPal API versions**  
The following PayPal API versions are supported:
+ v1

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies to allow access:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring PayPal


Before you can use AWS Glue to transfer data from PayPal, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a PayPal account with client credentials.
+ Your PayPal account has API access with a valid license.

If you meet these requirements, you're ready to connect AWS Glue to your PayPal account. For typical connections, you don't need to do anything else in PayPal.

# Configuring PayPal connections


PayPal supports the CLIENT CREDENTIALS grant type for OAuth2.
+ This grant type is considered two-legged OAuth 2.0, as it is used by clients to obtain an access token outside of the context of a user. AWS Glue uses the client ID and client secret to authenticate to the PayPal APIs, which are provided by custom services that you define.
+ Each custom service is owned by an API-only user that has a set of roles and permissions that authorize the service to perform specific actions. An access token is associated with a single custom service.
+ This grant type results in an access token that is short-lived and may be renewed by calling the `/v1/oauth2/token` endpoint again.
+ For public PayPal documentation for OAuth 2.0 with client credentials, see [Authentication](https://developer.paypal.com/api/rest/authentication/).

To configure a PayPal connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For the customer managed connected app, the Secret should contain the connected app Consumer Secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as key.

   1. Note: you must create a secret for your connections in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select PayPal.

   1. Provide the `INSTANCE_URL` of the PayPal instance you want to connect to.

   1. Select the AWS IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the secret, `secretName`, that you want to use for this connection; AWS Glue stores the OAuth tokens in this secret.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

## Getting OAuth 2.0 credentials


To call the REST API, you'll need to exchange your client ID and client secret for an access token. For more information, see [Get started with PayPal REST APIs](https://developer.paypal.com/api/rest/).
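
The exchange itself (which AWS Glue performs for you once the connection is configured) looks roughly like the following sketch against PayPal's documented `/v1/oauth2/token` endpoint. The credentials and function names are placeholders; this is only to illustrate the flow.

```python
import base64
import json
import urllib.request

def basic_auth_header(client_id, client_secret):
    # PayPal's token endpoint uses HTTP Basic auth over the
    # client ID and client secret.
    token = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return f"Basic {token}"

def fetch_access_token(client_id, client_secret,
                       base_url="https://api-m.paypal.com"):
    # POST grant_type=client_credentials to /v1/oauth2/token and read
    # the access_token from the JSON response.
    req = urllib.request.Request(
        f"{base_url}/v1/oauth2/token",
        data=b"grant_type=client_credentials",
        headers={
            "Authorization": basic_auth_header(client_id, client_secret),
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["access_token"]
```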

# Reading from PayPal entities


**Prerequisite**

A PayPal object you would like to read from. You will need the object name, for example `transaction`.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| transaction | Yes | Yes | No | Yes | Yes | 

**Example**:

```
paypal_read = glueContext.create_dynamic_frame.from_options(
    connection_type="paypal",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "transaction",
        "API_VERSION": "v1",
        "INSTANCE_URL": "https://api-m.paypal.com"
    }
)
```

**PayPal entity and field details**:

Entities with static metadata:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/paypal-reading-from-entities.html)

## Partitioning queries


You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For the Datetime field, we accept the value in ISO format.

  Examples of valid value:

  ```
  "2024-07-01T00:00:00.000Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.

The following field is supported for entity-wise partitioning:


| Entity name | Partitioning fields | Data type | 
| --- | --- | --- | 
| transaction | `transaction_initiation_date` | DateTime | 

Example:

```
paypal_read = glueContext.create_dynamic_frame.from_options(
    connection_type="paypal",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "transaction",
        "API_VERSION": "v1",
        "PARTITION_FIELD": "transaction_initiation_date",
        "LOWER_BOUND": "2024-07-01T00:00:00.000Z",
        "UPPER_BOUND": "2024-07-02T00:00:00.000Z",
        "NUM_PARTITIONS": "10"
    }
)
```

# PayPal connection options


The following are connection options for PayPal:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in PayPal.
+ `API_VERSION`(String) - (Required) Used for Read. The PayPal REST API version you want to use.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition the query.
+ `LOWER_BOUND`(String) - Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for the read.

# Limitations and notes for PayPal connector


The following are limitations or notes for the PayPal connector:
+ The [PayPal transactions documentation](https://developer.paypal.com/docs/api/transaction-search/v1/#search_get) mentions that it takes a maximum of three hours for executed transactions to appear in the list transactions call. However, it has been observed to take more time than that depending the [https://developer.paypal.com/docs/api/transaction-search/v1/#search_get:~:text=last_refreshed_datetime](https://developer.paypal.com/docs/api/transaction-search/v1/#search_get:~:text=last_refreshed_datetime). Here, `last_refreshed_datetime` is the amount of time until which you have data available from the APIs.
+ If the `last_refreshed_datetime` is less than the requested `end_date` then, the `end_date` becomes equal to the `last_refreshed_datetime` as we only have data up until that point.
+ The `transaction_initiation_date` field is a mandatory filter to be provided for the `transaction` entity and the [maximum supported](https://developer.paypal.com/docs/transaction-search/#:~:text=The%20maximum%20supported%20date%20range%20is%2031%20days.) date range for this field is 31 days.
+ When you call the `transaction` entity API with filters (query parameters) other than the `transaction_initiation_date` field, the [`ending_balance`](https://developer.paypal.com/docs/api/transaction-search/v1/#search_get:~:text=If%20you%20specify%20one%20or%20more%20optional%20query%20parameters%2C%20the%20ending_balance%20response%20field%20is%20empty.) response field is expected to be empty.
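
Because the `transaction_initiation_date` window must stay within 31 days and data is only available up to `last_refreshed_datetime`, a job can clamp its requested window before building the filter. The helper below is an illustrative sketch, not part of the connector API.

```python
from datetime import datetime, timedelta

# Maximum supported transaction_initiation_date range (see limitation above).
MAX_RANGE = timedelta(days=31)

def clamp_window(start, end, last_refreshed):
    """Clamp a requested [start, end) window to the 31-day limit and to
    last_refreshed_datetime, mirroring the PayPal behavior described above."""
    end = min(end, last_refreshed)  # no data beyond last_refreshed_datetime
    if end - start > MAX_RANGE:
        raise ValueError("transaction_initiation_date range must be <= 31 days")
    return start, end
```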

# Connecting to Pendo

Pendo provides a rich data store for user interaction data. Customers transfer this data to AWS so that they can join it with other product data, perform additional analysis and dashboarding, and set alerts if they choose.

**Topics**
+ [

# AWS Glue support for Pendo
](pendo-support.md)
+ [

# Policies containing the API operations for creating and using connections
](pendo-configuring-iam-permissions.md)
+ [

# Configuring Pendo
](pendo-configuring.md)
+ [

# Configuring Pendo connections
](pendo-configuring-connections.md)
+ [

# Reading from Pendo entities
](pendo-reading-from-entities.md)
+ [

# Pendo connection options
](pendo-connection-options.md)
+ [

# Limitations
](pendo-connector-limitations.md)

# AWS Glue support for Pendo


AWS Glue supports Pendo as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Pendo.

**Supported as a target?**  
No.

**Supported Pendo API versions**  
 v1 

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 
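
If you manage roles programmatically, attaching these managed policies can be sketched as follows. The role name is a placeholder, and boto3 is imported lazily inside the helper so the sketch can be defined without AWS credentials.

```python
# Sketch: attach the managed policies above to an existing IAM role.
# "MyGlueConnectionRole" (the caller's argument) is a placeholder role name.
GLUE_MANAGED_POLICY_ARNS = (
    "arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole",
    "arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess",
)

def attach_glue_managed_policies(role_name):
    """Attach both managed policies to role_name via the IAM API."""
    import boto3  # lazy import: only needed when the call is actually made

    iam = boto3.client("iam")
    for arn in GLUE_MANAGED_POLICY_ARNS:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=arn)
```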

# Configuring Pendo


Before you can use AWS Glue to transfer from Pendo, you must meet the following requirements:

## Minimum requirements

+ You have a Pendo account with an `apiKey` with `write access` enabled.
+  Your Pendo account has API access with a valid license. 

If you meet these requirements, you’re ready to connect AWS Glue to your Pendo account. For typical connections, you don't need to do anything else in Pendo.

# Configuring Pendo connections


Pendo supports custom authentication.

For the public Pendo documentation on generating the required API keys for custom authorization, see [Authentication – Pendo REST API Documentation](https://engageapi.pendo.io/?bash#getting-started).

To configure a Pendo connection:

1. In AWS Secrets Manager, create a secret with the following details: 
   + For a customer managed connected app – the secret should contain the connected app Consumer Secret with `apiKey` as the key. 
**Note**  
You must create a separate secret for each connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Data Source**, select Pendo.

   1. Provide the `instanceUrl` of the Pendo instance you want to connect to.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions: 

------
#### [ JSON ]

****  

      ```
      {
        "Version":"2012-10-17",		 	 	 
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want AWS Glue to use for this connection to store the tokens. 

   1.  Select the network options if you want to use your network. 

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 

1. In your AWS Glue job configuration, provide `connectionName` as an Additional network connection.
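
Step 1 above (creating the per-connection secret) can also be sketched programmatically. The secret name is a placeholder; the payload must use `apiKey` as the key, as described above. boto3 is imported lazily so the helper can be defined without AWS installed.

```python
import json

def create_pendo_connection_secret(secret_name, api_key):
    """Create the per-connection secret holding the Pendo apiKey (sketch).
    secret_name is a placeholder chosen by you; one secret per connection."""
    import boto3  # lazy import: only needed when the call is actually made

    sm = boto3.client("secretsmanager")
    return sm.create_secret(
        Name=secret_name,
        SecretString=json.dumps({"apiKey": api_key}),  # key must be "apiKey"
    )
```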

# Reading from Pendo entities


 **Prerequisites** 

A Pendo object you would like to read from. See the supported entities table below to check the available entities. 

 **Supported entities** 
+ [Feature](https://developers.pendo.io/docs/?bash#feature)
+ [Guide](https://developers.pendo.io/docs/?bash#guide)
+ [Page](https://developers.pendo.io/docs/?bash#page)
+ [Report](https://developers.pendo.io/docs/?bash#report)
+ [Report Data](https://developers.pendo.io/docs/?bash#return-report-contents-as-array-of-json-objects)
+ [Visitor](https://developers.pendo.io/docs/?bash#visitor)
+ [Account](https://developers.pendo.io/docs/?bash#entities)
+ [Event](https://developers.pendo.io/docs/?bash#events-grouped)
+ [Feature Event](https://developers.pendo.io/docs/?bash#events-grouped)
+ [Guide Event](https://developers.pendo.io/docs/?bash#events-ungrouped)
+ [Page Event](https://developers.pendo.io/docs/?bash#events-grouped)
+ [Poll Event ](https://developers.pendo.io/docs/?bash#events-ungrouped)
+ [Track Event](https://developers.pendo.io/docs/?bash#events-grouped)


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select \* | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Feature | No | No | No | Yes | No | 
| Guide | No | No | No | Yes | No | 
| Page | No | No | No | Yes | No | 
| Report | No | No | No | Yes | No | 
| Report Data | No | No | No | Yes | No | 
| Visitor (Aggregation API) | Yes | No | Yes | Yes | No | 
| Account (Aggregation API) | Yes | No | Yes | Yes | No | 
| Event (Aggregation API) | Yes | No | Yes | Yes | No | 
| Feature Event (Aggregation API) | Yes | No | Yes | Yes | Yes | 
| Guide Event (Aggregation API) | Yes | No | Yes | Yes | Yes | 
| Page Event (Aggregation API) | Yes | No | Yes | Yes | Yes | 
| Poll Event (Aggregation API) | Yes | No | Yes | Yes | Yes | 
| Track Event (Aggregation API) | Yes | No | Yes | Yes | Yes | 

 **Example** 

```
pendo_read = glueContext.create_dynamic_frame.from_options(
    connection_type="glue.spark.pendo",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "feature",
        "API_VERSION": "v1",
        "INSTANCE_URL": "instanceUrl"
    }
)
```

## Partitioning queries


You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For the DateTime field, we accept the value in ISO format.

  Example of valid value:

  ```
  "2024-07-01T00:00:00.000Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.
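
The splitting that these options describe can be sketched as follows. This is an illustration of how `NUM_PARTITIONS` divides the `[LOWER_BOUND, UPPER_BOUND)` range into equal sub-ranges, not the connector's exact implementation.

```python
from datetime import datetime

def partition_bounds(lower, upper, num_partitions):
    """Split [lower, upper) into num_partitions equal sub-ranges, one per
    Spark task (illustrative sketch of the partitioning described above)."""
    step = (upper - lower) / num_partitions
    bounds = [lower + step * i for i in range(num_partitions)] + [upper]
    # Pair each sub-range's inclusive start with its exclusive end.
    return list(zip(bounds[:-1], bounds[1:]))
```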

The following table lists the entities that support query partitioning:


| Entity name | 
| --- | 
| Event | 
|  Feature Event  | 
| Guide Event | 
| Page Event | 
| Poll Event | 
| Track Event | 

Example:

```
pendo_read = glueContext.create_dynamic_frame.from_options(
    connection_type="glue.spark.pendo",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "event",
        "API_VERSION": "v1",
        "INSTANCE_URL": "instanceUrl",
        "NUM_PARTITIONS": "10",
        "PARTITION_FIELD": "appId",
        "LOWER_BOUND": "4656",
        "UPPER_BOUND": "7788"
    }
)
```

# Pendo connection options


The following are connection options for Pendo:
+  `ENTITY_NAME`(String) – (Required) Used for Read. The name of your object in Pendo. 
+ `INSTANCE_URL`(String) - (Required) A valid Pendo Instance URL with the following allowed values:
  + [Default](https://app.pendo.io/)
  + [Europe](https://app.eu.pendo.io/)
  + [US1](https://us1.app.pendo.io/)
+ `API_VERSION`(String) - (Required) Used for Read. The Pendo Engage REST API version you want to use. For example: v1.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT \*). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition query.
+ `LOWER_BOUND`(String)- Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field. 
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read.

# Limitations


The following are limitations for the Pendo connector:
+ Pagination is not supported in Pendo.
+ Filtering is supported only for the Aggregation API objects (`Account`, `Event`, `Feature Event`, `Guide Event`, `Page Event`, `Poll Event`, `Track Event`, and `Visitor`).
+ DateTimeRange is a mandatory filter parameter for the Aggregation API objects `Event`, `Feature Event`, `Guide Event`, `Page Event`, `Poll Event`, and `Track Event`.
+ The dayRange period is rounded down to the start of the period in the time zone. For example, if the provided filter is `2023-01-12T07:55:27.065Z`, the time period is rounded to the start of the period, that is, `2023-01-12T00:00:00Z`.

# Connecting to Pipedrive


 Pipedrive is a sales pipeline CRM designed to help small businesses manage leads, track sales activities, and close more deals. It enables sales teams to streamline processes and consolidate sales data in one unified CRM tool. If you're a Pipedrive user, you can connect AWS Glue to your Pipedrive account. Then, you can use Pipedrive as a data source in your ETL jobs. Run these jobs to transfer data between Pipedrive and AWS services or other supported applications. 

**Topics**
+ [

# AWS Glue support for Pipedrive
](pipedrive-support.md)
+ [

# Policies containing the API operations for creating and using connections
](pipedrive-configuring-iam-permissions.md)
+ [

# Configuring Pipedrive
](pipedrive-configuring.md)
+ [

# Configuring Pipedrive connections
](pipedrive-configuring-connections.md)
+ [

# Reading from Pipedrive entities
](pipedrive-reading-from-entities.md)
+ [

# Pipedrive connection option reference
](pipedrive-connection-options.md)
+ [

# Limitations
](pipedrive-connector-limitations.md)

# AWS Glue support for Pipedrive


AWS Glue supports Pipedrive as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Pipedrive.

**Supported as a target?**  
No.

**Supported Pipedrive API versions**  
 v1. 

# Policies containing the API operations for creating and using connections

 The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

You can also use the following managed IAM policies to allow access:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring Pipedrive


Before you can use AWS Glue to transfer data from Pipedrive, you must meet these requirements:

## Minimum requirements

+  You have a Pipedrive account. 
+  Your Pipedrive account is enabled for API access. 

 If you meet these requirements, you’re ready to connect AWS Glue to your Pipedrive account. For typical connections, you don't need to do anything else in Pipedrive. 

# Configuring Pipedrive connections


 Pipedrive supports the `AUTHORIZATION_CODE` grant type for OAuth2. 
+  This grant type is considered “three-legged” OAuth because it relies on redirecting users to the third-party authorization server to authenticate the user. It is used when creating connections through the AWS Glue console. By default, the user creating a connection may rely on an AWS Glue-owned connected app, where they do not need to provide any OAuth-related information except for the Pipedrive instance URL. The AWS Glue console redirects the user to Pipedrive, where the user must log in and allow AWS Glue the requested permissions to access their Pipedrive instance. 
+  Users may opt to create their own connected app in Pipedrive and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they are still redirected to Pipedrive to log in and authorize AWS Glue to access their resources. 
+  This grant type results in a refresh token and an access token. The access token is active for one hour and may be refreshed automatically, without user interaction, using the refresh token. 
+  For more information, see the [documentation on creating a connected app for the `AUTHORIZATION_CODE` OAuth flow](https://developers.pipedrive.com/docs/api/v1/Oauth). 

To configure a Pipedrive connection:

1.  In AWS Secrets Manager, create a secret with the following details. You must create a separate secret for each connection in AWS Glue. 

   1.  For a customer managed connected app – the secret should contain the connected app Consumer Secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key. 

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below: 

   1.  Under Data Connections, choose **Create connection**. 

   1. When selecting a **Data Source**, select Pipedrive.

   1. Provide your Pipedrive **instanceURL**.

   1.  Select the IAM role that AWS Glue can assume and that has permissions for the following actions: 

------
#### [ JSON ]

****  

      ```
      {
        "Version":"2012-10-17",		 	 	 
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1.  Provide the User Managed Client Application ClientId of the Pipedrive instance you want to connect to. 

   1.  Select the `secretName` that you want AWS Glue to use for this connection to store the tokens. 

   1.  Select the network options if you want to use your network. 

1.  Grant the IAM role associated with your AWS Glue job permission to read `secretName`. Choose **Next**. 

1.  Provide the **connectionName** and choose **Next**. 

1.  On the next page, choose **Create connection**. You are asked to log in to Pipedrive. Provide your username and password and choose **Log In**. 

1.  Once you are logged in, choose **Continue to the App**. Your connection is now ready to use. 

1.  In your AWS Glue job configuration, provide `connectionName` as an **Additional network connection**. 
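
The secret created in step 1 must store the client secret under the exact key shown there. A minimal sketch of building that payload (the value is a placeholder; pass the result as `SecretString` to Secrets Manager's `create_secret`):

```python
import json

# The per-connection secret must use this exact key (see step 1 above).
payload = json.dumps(
    {"USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET": "your-client-secret"}
)
# Sketch of the Secrets Manager call (secret name is a placeholder):
# boto3.client("secretsmanager").create_secret(
#     Name="my-pipedrive-secret", SecretString=payload)
```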

# Reading from Pipedrive entities


 **Prerequisites** 
+  A Pipedrive object you would like to read from. See the supported entities table below to check the available entities. 

 **Supported entities** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select \* | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Activities | Yes | Yes | No | Yes | Yes | 
| Activity Type | No | No | No | Yes | No | 
| Call Logs | No | No | No | Yes | No | 
| Currencies | Yes | Yes | No | Yes | No | 
| Deals | Yes | Yes | Yes | Yes | Yes | 
| Leads | Yes | Yes | Yes | Yes | No | 
| Lead Sources | No | Yes | No | Yes | No | 
| Lead Labels | No | No | No | No | No | 
| Notes | Yes | Yes | Yes | Yes | Yes | 
| Organization | Yes | Yes | No | Yes | Yes | 
| Permission Sets | Yes | No | No | Yes | No | 
| Persons | Yes | Yes | Yes | Yes | Yes | 
| Pipelines | No | Yes | No | Yes | No | 
| Products | Yes | Yes | No | Yes | Yes | 
| Roles | No | Yes | No | Yes | No | 
| Stages | Yes | Yes | No | Yes | No | 
| Users | No | No | No | Yes | No | 

 **Example** 

```
pipedrive_read = glueContext.create_dynamic_frame.from_options(
    connection_type="PIPEDRIVE",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "activities",
        "API_VERSION": "v1"
    }
)
```

 **Pipedrive entity and field details** 

 Entities list: 
+  Activities: [ https://developers.pipedrive.com/docs/api/v1/Activities ](https://developers.pipedrive.com/docs/api/v1/Activities) 
+  Activity Type: [ https://developers.pipedrive.com/docs/api/v1/ActivityTypes ](https://developers.pipedrive.com/docs/api/v1/ActivityTypes) 
+  Call Logs: [ https://developers.pipedrive.com/docs/api/v1/CallLogs ](https://developers.pipedrive.com/docs/api/v1/CallLogs) 
+  Currencies: [ https://developers.pipedrive.com/docs/api/v1/Currencies ](https://developers.pipedrive.com/docs/api/v1/Currencies) 
+  Deals: [ https://developers.pipedrive.com/docs/api/v1/Deals ](https://developers.pipedrive.com/docs/api/v1/Deals) 
+  Leads: [ https://developers.pipedrive.com/docs/api/v1/Leads ](https://developers.pipedrive.com/docs/api/v1/Leads) 
+  Lead Sources: [ https://developers.pipedrive.com/docs/api/v1/LeadSources ](https://developers.pipedrive.com/docs/api/v1/LeadSources) 
+  Lead Labels: [ https://developers.pipedrive.com/docs/api/v1/LeadLabels ](https://developers.pipedrive.com/docs/api/v1/LeadLabels) 
+  Notes: [ https://developers.pipedrive.com/docs/api/v1/Notes ](https://developers.pipedrive.com/docs/api/v1/Notes) 
+  Organizations: [ https://developers.pipedrive.com/docs/api/v1/Organizations ](https://developers.pipedrive.com/docs/api/v1/Organizations) 
+  Permission Sets: [ https://developers.pipedrive.com/docs/api/v1/PermissionSets ](https://developers.pipedrive.com/docs/api/v1/PermissionSets) 
+  Persons: [ https://developers.pipedrive.com/docs/api/v1/Persons ](https://developers.pipedrive.com/docs/api/v1/Persons) 
+  Pipelines: [ https://developers.pipedrive.com/docs/api/v1/Pipelines ](https://developers.pipedrive.com/docs/api/v1/Pipelines) 
+  Products: [ https://developers.pipedrive.com/docs/api/v1/Products ](https://developers.pipedrive.com/docs/api/v1/Products) 
+  Roles: [ https://developers.pipedrive.com/docs/api/v1/Roles ](https://developers.pipedrive.com/docs/api/v1/Roles) 
+  Stages: [ https://developers.pipedrive.com/docs/api/v1/Stages ](https://developers.pipedrive.com/docs/api/v1/Stages) 
+  Users: [ https://developers.pipedrive.com/docs/api/v1/Users ](https://developers.pipedrive.com/docs/api/v1/Users) 


| Entity | Data Type | Supported Operators | 
| --- | --- | --- | 
| Activities, Deals, Notes, Organization, Persons and Products. | Date | '=' | 
|  | Integer | '=' | 
|  | String | '=' | 
|  | Boolean | '=' | 
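
Because only the `=` operator is supported for these entities, a filtered read keeps its predicate to simple equality. A sketch, assuming an Activities Boolean field named `done` (the field and connection names are illustrative):

```python
# Sketch: reading the Activities entity with an equality filter, the only
# operator listed as supported in the table above. Names are illustrative.
def read_filtered_activities(glueContext, connection_name):
    """Return a DynamicFrame of Activities where done = true."""
    return glueContext.create_dynamic_frame.from_options(
        connection_type="PIPEDRIVE",
        connection_options={
            "connectionName": connection_name,
            "ENTITY_NAME": "activities",
            "API_VERSION": "v1",
            "FILTER_PREDICATE": "done = true",  # Boolean field, '=' operator
        },
    )
```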

## Partitioning queries


 In Pipedrive, only one field (`due_date`) from the Activities entity supports field-based partitioning. It is a Date field. 

 You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently. 
+  `PARTITION_FIELD`: the name of the field to be used to partition query. 
+  `LOWER_BOUND`: an inclusive lower bound value of the chosen partition field. 

   For date, we accept the Spark date format used in Spark SQL queries. Example of valid values: `"2024-02-06"`. 
+  `UPPER_BOUND`: an exclusive upper bound value of the chosen partition field. 
+  `NUM_PARTITIONS`: number of partitions. 

 **Example** 

```
pipedrive_read = glueContext.create_dynamic_frame.from_options(
    connection_type="PIPEDRIVE",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "activities",
        "API_VERSION": "v1",
        "PARTITION_FIELD": "due_date",
        "LOWER_BOUND": "2023-09-07T02:03:00.000Z",
        "UPPER_BOUND": "2024-05-07T02:03:00.000Z",
        "NUM_PARTITIONS": "10"
    }
)
```

# Pipedrive connection option reference


The following are connection options for Pipedrive:
+  `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in Pipedrive. 
+  `API_VERSION`(String) - (Required) Used for Read. The Pipedrive REST API version you want to use. Example: v1. 
+  `INSTANCE_URL`(String) - (Required) The URL of the instance where you want to run the operations. 
+  `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT \*). Used for Read. Columns you want to select for the object. 
+  `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format. 
+  `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query. 
+  `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition query. 
+  `LOWER_BOUND`(String)- Used for Read. An inclusive lower bound value of the chosen partition field. 
+  `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field. 
+  `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read. 

# Limitations


The following are limitations for the Pipedrive connector:
+ Pipedrive supports field-based partitioning for only one entity (Activities).
+ Pipedrive supports record-based partitioning for the Activities, Deals, Notes, Persons, Organizations, and Products entities.
+ In the Deals entity, filtering on the status field returns all records if an invalid filter value is used.
+ In the Deals entity, ordering by multiple fields is not supported.
+ Pipedrive silently throttles requests and delays responses. As a result, running an AWS Glue job with large datasets, or calling the same API endpoint many times, may result in a timeout. With an increasing number of partitions, the Pipedrive endpoint can take longer to respond than with a single partition, even though connector response times decrease as expected as the number of partitions grows.

# Connecting to Productboard

Productboard is a product management system that helps product teams get the right products to market, faster. Over 3,000 modern, product-led companies, like Zendesk, UiPath, and Microsoft, use Productboard to understand what users really need, prioritize what to build next, and rally everyone around their roadmap.

**Topics**
+ [

# AWS Glue support for Productboard
](productboard-support.md)
+ [

# Policies containing the API operations for creating and using connections
](productboard-configuring-iam-permissions.md)
+ [

# Configuring Productboard
](productboard-configuring.md)
+ [

# Configuring Productboard connections
](productboard-configuring-connections.md)
+ [

# Reading from Productboard entities
](productboard-reading-from-entities.md)
+ [

# Productboard connection options
](productboard-connection-options.md)
+ [

# Creating a Productboard account
](productboard-create-account.md)
+ [

# Limitations
](productboard-connector-limitations.md)

# AWS Glue support for Productboard


AWS Glue supports Productboard as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Productboard.

**Supported as a target?**  
No.

**Supported Productboard API versions**  
 v1 

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring Productboard


Before you can use AWS Glue to transfer from Productboard, you must meet the following requirements:

## Minimum requirements

+ You have a Productboard account with an email and password. For more information about creating an account, see [Creating a Productboard account](productboard-create-account.md). 
+ You have an AWS account with service access to AWS Glue. 
+ You have your Productboard account’s authentication details: either a JWT token if you want to use custom authentication, or a client ID and secret if you want to use `OAuth2.0`.
+ If you want to use `OAuth2.0`, [register your application with Productboard](https://app.productboard.com/oauth2/applications/new) and set up the application by following the instructions at [How to integrate with Productboard via OAuth2 - developer documentation](https://developer.productboard.com/docs/how-to-integrate-with-productboard-via-oauth2-developer-documentation).

If you meet these requirements, you’re ready to connect AWS Glue to your Productboard account. For typical connections, you don't need to do anything else in Productboard.

# Configuring Productboard connections


 

Productboard supports custom authentication and `OAuth2.0`. For `OAuth2.0` Productboard supports the `AUTHORIZATION_CODE` grant type.
+ This grant type is considered “three-legged” `OAuth` because it relies on redirecting users to the third-party authorization server to authenticate the user. It is used when creating connections through the AWS Glue console. By default, the user creating a connection may rely on an AWS Glue-owned connected app, where they do not need to provide any `OAuth`-related information except for their Productboard Client ID and Client Secret. The AWS Glue console redirects the user to Productboard, where the user must log in and allow AWS Glue the requested permissions to access their Productboard instance.
+ Users may still opt to create their own connected app in Productboard and provide their own Client ID and Client Secret when creating connections through the AWS Glue console. In this scenario, they are still redirected to Productboard to log in and authorize AWS Glue to access their resources.
+ This grant type results in a refresh token and an access token. The access token is short-lived and may be refreshed automatically, without user interaction, using the refresh token.
+ For public Productboard documentation on creating a connected app for `AUTHORIZATION_CODE OAuth` flow, see [How to integrate with Productboard via OAuth2 - developer documentation](https://developer.productboard.com/docs/how-to-integrate-with-productboard-via-oauth2-developer-documentation). 

To configure a Productboard connection:

1. In AWS Secrets Manager, create a secret with the following details:
   + For `OAuth` auth – For a customer managed connected app, the secret should contain the connected app consumer secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key.
   + For `Custom auth` – For a customer managed connected app, the secret should contain the connected app `JWT token` with `access_token` as the key.
**Note**  
You must create a secret per connection in AWS Glue.
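As a sketch of this step, you can also create the secret programmatically with the AWS SDK for Python (boto3). The secret name and client-secret value below are placeholders, not values from this guide:

```python
import json

# Key expected by the AWS Glue Productboard connector for OAuth auth
SECRET_KEY = "USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET"

def build_secret_string(client_secret: str) -> str:
    """Build the JSON secret payload for a customer managed connected app."""
    return json.dumps({SECRET_KEY: client_secret})

def create_connection_secret(secret_name: str, client_secret: str):
    """Create one secret per AWS Glue connection (uninvoked sketch;
    assumes boto3 is installed and AWS credentials are configured)."""
    import boto3
    client = boto3.client("secretsmanager")
    return client.create_secret(
        Name=secret_name,
        SecretString=build_secret_string(client_secret),
    )

payload = build_secret_string("my-productboard-client-secret")
print(payload)
```

Remember that each AWS Glue connection needs its own secret, so repeat the call with a distinct `secret_name` per connection.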

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Data Source**, select Productboard. 

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version":"2012-10-17",		 	 	 
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the authentication type to connect to the data source:
      + For `OAuth` auth – Provide the `Token URL` and the `User Managed Client Application ClientId` of the Productboard app.

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection.

   1. Select the network options if you want to use your own network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 
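For example, a minimal policy statement granting the job role read access to the secret might look like the following (the resource ARN is a placeholder for your secret's ARN):

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "secretsmanager:DescribeSecret",
        "secretsmanager:GetSecretValue"
      ],
      "Resource": "arn:aws:secretsmanager:<region>:<accountId>:secret:<secretName>*"
    }
  ]
}
```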

1. In your AWS Glue job configuration, provide `connectionName` as an Additional network connection.

# Reading from Productboard entities


 **Prerequisites** 

A Productboard object that you would like to read from. Refer to the supported entities table below to check the available entities.

 **Supported entities** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select * | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
|  Features  | Yes | Yes | No | Yes | Yes | 
|  Components  | No | Yes | No | Yes | No | 
|  Products  | No | Yes | No | Yes | No | 
|  Feature Statuses  | No | Yes | No | Yes | Yes | 
|  Custom Field Definitions  | No | Yes | No | Yes | No | 
|  Custom Field Values  | Yes | Yes | No | Yes | No | 

 **Example** 

```
Productboard_read = glueContext.create_dynamic_frame.from_options(
    connection_type="Productboard",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "feature",
        "API_VERSION": "1"
    }
)
```

 **Productboard entity and field details** 
+ [Features](https://developer.productboard.com/#tag/features)
+ [Components](https://developer.productboard.com/#tag/components)
+ [Feature statuses](https://developer.productboard.com/#tag/statuses)
+ [Products](https://developer.productboard.com/#tag/products)
+ [Custom fields definitions](https://developer.productboard.com/#tag/hierarchyEntitiesCustomFields)
+ [Custom fields values](https://developer.productboard.com/#tag/hierarchyEntitiesCustomFieldsValues)

# Productboard connection options


The following are connection options for Productboard:
+ `ENTITY_NAME` (String) – (Required) Used for Read/Write. The name of your object in Productboard.
+ `API_VERSION` (String) – (Required) Used for Read. The Productboard REST API version that you want to use. For example: 1.
+ `SELECTED_FIELDS` (List<String>) – Default: empty (SELECT *). Used for Read. The columns that you want to select for the object.
+ `FILTER_PREDICATE` (String) – Default: empty. Used for Read. The filter, in Spark SQL format.
+ `QUERY` (String) – Default: empty. Used for Read. A full Spark SQL query.
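To illustrate how these options combine, the following hedged sketch builds a `connection_options` map for a read with field selection and a filter. The field names and filter are illustrative placeholders, not taken from the Productboard schema; the commented call shows where the map would be used in a Glue job:

```python
# Hypothetical read options for the Productboard connector; the fields
# and the filter value are illustrative, not from the Productboard schema.
connection_options = {
    "connectionName": "my-productboard-connection",
    "ENTITY_NAME": "feature",
    "API_VERSION": "1",
    "SELECTED_FIELDS": ["id", "name", "status"],   # subset instead of SELECT *
    "FILTER_PREDICATE": "status = 'In progress'",  # Spark SQL format
}

# Inside a Glue job you would pass the map like this (not runnable outside Glue):
# frame = glueContext.create_dynamic_frame.from_options(
#     connection_type="Productboard",
#     connection_options=connection_options,
# )

print(sorted(connection_options))
```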

# Creating a Productboard account


1. Navigate to the [Productboard sign up page](https://app.productboard.com/), enter your email ID and password, and then choose **Log me in**.

1. In the **Account Name** field, enter the name of your Productboard account, and then select the **I agree to the Privacy Policy** check box.

1. On the **Now create your workspace** page, in the **Workspace URL** field, enter the URL for your new workspace. Then choose **Continue** to proceed to the next page and provide the remaining details.

   This creates your trial account. The trial account is free for 15 days. After the trial period expires, you can purchase a paid plan. Make a note of your email address, password, and workspace URL. You will need this information to access your account in the future.

**Registering an `OAuth2.0` application**

1. Navigate to the [Productboard login page](https://login.productboard.com/?locale=en), enter your email ID and password, and choose **Log in**. 

1. Select the **User** icon in the upper-right corner, and then choose **Account and billing** from the dropdown menu.

1. Select **Extras** and choose **Registered apps** from the dropdown menu.

1. Locate and choose **Register An App**.

1. Enter the following details:
   + **App name** – Name of the app. 
   + **Company / Organization** – Name of your Company or Organization.
   + **App website** – Website of the app.
   + **Redirect URI** – A Redirect URI pattern is a URI path (or comma-separated list of paths) to which Productboard can redirect (if requested) when the login flow is complete. For example, `https://ap-southeast-2.console.aws.amazon.com`

1. Choose **Create**. 

1. The **Client ID** and **Client Secret** will now be visible. Copy and save them in a secure location. Then, choose **Done**. 
**Note**  
Your Client ID and Client Secret strings are credentials used to establish a connection with this connector when using AppFlow or AWS Glue.

**Retrieving CustomAuth Credentials**

1. Navigate to the [Productboard login page](https://app.productboard.com/), enter your email ID and password, and choose **Log me in**.

   You will be redirected to the home page.

1. On the home page, navigate to **Workspace Settings** > **Integrations** > **Public APIs** > **Access Token**.
**Note**  
If the **Public APIs** section is not visible, your account might be on the Essentials plan. Access to API tokens requires at least a Pro plan. Plan features and names are subject to change. For more information about the packages, see [Productboard pricing](https://www.productboard.com/pricing/).

1. Choose the option to generate a new token, and make sure to store it securely for future reference.
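Before storing the token in Secrets Manager, you can sanity-check it against the Productboard public API. The sketch below only builds the request headers; the base URL and `X-Version` header reflect the public Productboard API documentation, but verify them against the current docs:

```python
PRODUCTBOARD_API = "https://api.productboard.com"  # assumed base URL

def build_headers(access_token: str) -> dict:
    """Headers for a bearer-token call to the Productboard public API."""
    return {
        "Authorization": f"Bearer {access_token}",
        "X-Version": "1",  # versioning header used by Productboard's public API
    }

# Example call (commented out; requires the `requests` package and a real token):
# import requests
# resp = requests.get(f"{PRODUCTBOARD_API}/features", headers=build_headers(token))
# resp.raise_for_status()

headers = build_headers("example-token")
print(headers["Authorization"])
```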

**Creating `OAuth2.0` credentials**

To utilize `OAuth2.0` authentication with the Productboard connector, you need to register your application on the Productboard platform and generate a `Client ID` and `Client Secret`.

1. Navigate to the [Productboard login page](https://app.productboard.com/), enter your email ID and password, and choose **Log me in**.

1. To register a new OAuth2 application with your Productboard account, navigate to the Productboard app registration page.

1. Complete the required fields and select the necessary scopes for each entity you wish to access. 
**Note**  
Select the four scopes that are required for the six supported entities.

1. The **Redirect URL** must have the following format: `https://ap-southeast-2.console.aws.amazon.com`
**Note**  
The AppFlow redirect URLs are subject to change. When updated URLs become available, update the redirect URLs for the AWS Glue platform accordingly.

1. The **Client ID** and **Client Secret** will now be visible. Copy and save them in a secure location. 

1. You can set up and verify `OAuth2` by following the steps in the [How to Integrate with Productboard via OAuth2 developer](https://developer.productboard.com/docs/how-to-integrate-with-productboard-via-oauth2-developer-documentation) documentation.

# Limitations


The following are limitations for the Productboard connector:
+ Productboard doesn’t support either field-based or record-based partitioning.

# Connecting to QuickBooks

QuickBooks is a leading accounting application for small and medium-sized businesses. QuickBooks dates back to the 1980s as one of Intuit's first products, and was originally desktop software. Today, QuickBooks offers several accounting and business financial applications, both as installable software and as cloud-based SaaS. As a QuickBooks user, you can connect AWS Glue to your QuickBooks account. Then, you can use QuickBooks as a data source in your ETL jobs. Run these jobs to transfer data between QuickBooks and AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for QuickBooks
](quickbooks-support.md)
+ [

# Policies containing the API operations for creating and using connections
](quickbooks-configuring-iam-permissions.md)
+ [

# Configuring QuickBooks
](quickbooks-configuring.md)
+ [

# Configuring QuickBooks connections
](quickbooks-configuring-connections.md)
+ [

# Reading from QuickBooks entities
](quickbooks-reading-from-entities.md)
+ [

# QuickBooks connection options
](quickbooks-connection-options.md)
+ [

# Limitations and notes for QuickBooks connector
](quickbooks-connector-limitations.md)

# AWS Glue support for QuickBooks


AWS Glue supports QuickBooks as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from QuickBooks.

**Supported as a target?**  
No.

**Supported QuickBooks API versions**  
The following QuickBooks API versions are supported:
+ v3

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring QuickBooks


Before you can use AWS Glue to transfer data from QuickBooks, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a QuickBooks account.
+ Your QuickBooks account is enabled for API access.

For more information, see the following topics in QuickBooks documentation:
+ [Create an Intuit Account](https://quickbooks.intuit.com/learn-support/en-us/help-article/account-management/create-intuit-user-account/L62kSFEOM_US_en_US)
+ [Create and start developing your app](https://developer.intuit.com/app/developer/qbo/docs/get-started/start-developing-your-app)

If you meet these requirements, you’re ready to connect AWS Glue to your QuickBooks account. For typical connections, you don't need to do anything else in QuickBooks.

# Configuring QuickBooks connections


QuickBooks supports the AUTHORIZATION_CODE grant type for OAuth2. The grant type determines how AWS Glue communicates with QuickBooks to request access to your data.
+ This grant type is considered "three-legged" OAuth as it relies on redirecting users to a third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue console.
+ Users may still opt to create their own connected app in QuickBooks and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to QuickBooks to login and authorize AWS Glue to access their resources.
+ This grant type results in a refresh token and access token. The access token is short lived, and may be refreshed automatically without user interaction using the refresh token.
+ For public QuickBooks documentation on creating a connected app for Authorization Code OAuth flow, see [Set up OAuth 2.0](https://developer.intuit.com/app/developer/qbo/docs/develop/authentication-and-authorization/oauth-2.0).

To configure a QuickBooks connection:

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select QuickBooks.

   1. Provide the instance URL and company ID of the QuickBooks instance you want to connect to.

   1. Select the AWS IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version":"2012-10-17",		 	 	 
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection.

   1. Select the network options if you want to use your own network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

# Reading from QuickBooks entities


**Prerequisite**

A QuickBooks object you would like to read from.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Account | Yes | Yes | Yes | Yes | Yes | 
| Bill | Yes | Yes | Yes | Yes | Yes | 
| Company Info | No | No | No | Yes | No | 
| Customer | Yes | Yes | Yes | Yes | Yes | 
| Employee | Yes | Yes | Yes | Yes | Yes | 
| Estimate | Yes | Yes | Yes | Yes | Yes | 
| Invoice | Yes | Yes | Yes | Yes | Yes | 
| Item | Yes | Yes | Yes | Yes | Yes | 
| Payment | Yes | Yes | Yes | Yes | Yes | 
| Preferences | No | No | No | Yes | No | 
| Profit and Loss | Yes | No | No | Yes | No | 
| Tax Agency | Yes | Yes | Yes | Yes | Yes | 
| Vendors | Yes | Yes | Yes | Yes | Yes | 

**Example**:

```
QuickBooks_read = glueContext.create_dynamic_frame.from_options(
    connection_type="quickbooks",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "Account",
        "API_VERSION": "v3"
    }
)
```

**QuickBooks entity and field details**:

For more information about the entities and field details, see:
+ [Account](https://developer.intuit.com/app/developer/qbo/docs/api/accounting/most-commonly-used/account)
+ [Bill](https://developer.intuit.com/app/developer/qbo/docs/api/accounting/most-commonly-used/bill)
+ [CompanyInfo](https://developer.intuit.com/app/developer/qbo/docs/api/accounting/most-commonly-used/companyinfo)
+ [Customer](https://developer.intuit.com/app/developer/qbo/docs/api/accounting/most-commonly-used/customer)
+ [Employee](https://developer.intuit.com/app/developer/qbo/docs/api/accounting/most-commonly-used/employee)
+ [Estimate](https://developer.intuit.com/app/developer/qbo/docs/api/accounting/most-commonly-used/estimate)
+ [Invoice](https://developer.intuit.com/app/developer/qbo/docs/api/accounting/most-commonly-used/invoice)
+ [Item](https://developer.intuit.com/app/developer/qbo/docs/api/accounting/most-commonly-used/item)
+ [Payment](https://developer.intuit.com/app/developer/qbo/docs/api/accounting/most-commonly-used/payment)
+ [Preferences](https://developer.intuit.com/app/developer/qbo/docs/api/accounting/most-commonly-used/preferences)
+ [ProfitAndLoss](https://developer.intuit.com/app/developer/qbo/docs/api/accounting/most-commonly-used/profitandloss)
+ [TaxAgency](https://developer.intuit.com/app/developer/qbo/docs/api/accounting/most-commonly-used/taxagency)
+ [Vendor](https://developer.intuit.com/app/developer/qbo/docs/api/accounting/most-commonly-used/vendor)

## Partitioning queries


**Field-based partitioning**:

In QuickBooks, the Integer and DateTime datatype fields support field-based partitioning.

You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query would be split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For the Datetime field, we accept the Spark timestamp format used in Spark SQL queries.

  Examples of valid value:

  ```
  "2024-05-07T02:03:00.00Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.
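To see how these parameters interact, the following rough sketch (not the connector's actual implementation) splits an inclusive lower bound and exclusive upper bound into `NUM_PARTITIONS` contiguous sub-ranges, the way the original query is divided into concurrent sub-queries:

```python
from datetime import datetime

def split_partitions(lower: str, upper: str, num_partitions: int):
    """Split [lower, upper) into num_partitions contiguous sub-ranges.

    Each sub-range has an inclusive start and an exclusive end, mirroring
    LOWER_BOUND/UPPER_BOUND semantics.
    """
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ"
    lo, hi = datetime.strptime(lower, fmt), datetime.strptime(upper, fmt)
    step = (hi - lo) / num_partitions
    ranges = []
    for i in range(num_partitions):
        start = lo + i * step
        end = hi if i == num_partitions - 1 else lo + (i + 1) * step
        ranges.append((start, end))
    return ranges

parts = split_partitions("2023-09-07T02:03:00.000Z", "2024-05-07T02:03:00.000Z", 10)
print(len(parts))
```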

Example:

```
QuickBooks_read = glueContext.create_dynamic_frame.from_options(
    connection_type="quickbooks",
    connection_options={
        "connectionName": "connectionName",
        "REALMID": "12345678690123456789",
        "ENTITY_NAME": "Account",
        "API_VERSION": "v3",
        "PARTITION_FIELD": "MetaData_CreateTime"
        "LOWER_BOUND": "2023-09-07T02:03:00.000Z"
        "UPPER_BOUND": "2024-05-07T02:03:00.000Z"
        "NUM_PARTITIONS": "10"
    }
```

**Record-based partitioning**:

The original query is split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently:
+ `NUM_PARTITIONS`: the number of partitions.

Example:

```
QuickBooks_read = glueContext.create_dynamic_frame.from_options(
    connection_type="quickbooks",
    connection_options={
        "connectionName": "connectionName",
        "REALMID": "1234567890123456789",
        "ENTITY_NAME": "Bill",
        "API_VERSION": "v3",
        "NUM_PARTITIONS": "10"
    }
)
```

# QuickBooks connection options


The following are connection options for QuickBooks:
+ `ENTITY_NAME` (String) – (Required) Used for Read. The name of your object in QuickBooks.
+ `INSTANCE_URL` (String) – (Required) A valid QuickBooks instance URL.
+ `API_VERSION` (String) – (Required) Used for Read. The QuickBooks REST API version that you want to use.
+ `REALM_ID` (String) – An ID that identifies an individual QuickBooks Online company where you send requests.
+ `SELECTED_FIELDS` (List<String>) – Default: empty (SELECT *). Used for Read. The columns that you want to select for the object.
+ `FILTER_PREDICATE` (String) – Default: empty. Used for Read. The filter, in Spark SQL format.
+ `QUERY` (String) – Default: empty. Used for Read. A full Spark SQL query.
+ `PARTITION_FIELD` (String) – Used for Read. The field to be used to partition the query.
+ `LOWER_BOUND` (String) – Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND` (String) – Used for Read. An exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS` (Integer) – Default: 1. Used for Read. The number of partitions for the read.
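The options above can be combined into a single map. The sketch below is hypothetical: the realm ID and the filter field are illustrative placeholders, not values from a real company file, and the commented call shows where the map would be passed in a Glue job:

```python
# Hypothetical QuickBooks read options; the realm ID and the filter
# field name are illustrative placeholders.
connection_options = {
    "connectionName": "my-quickbooks-connection",
    "REALMID": "1234567890123456789",   # spelled REALMID in the doc's examples
    "ENTITY_NAME": "Invoice",
    "API_VERSION": "v3",
    "FILTER_PREDICATE": "TotalAmt > 100",  # Spark SQL format; field name assumed
    "NUM_PARTITIONS": "4",
}

# Inside a Glue job (not runnable outside Glue):
# frame = glueContext.create_dynamic_frame.from_options(
#     connection_type="quickbooks",
#     connection_options=connection_options,
# )

print(connection_options["API_VERSION"])
```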

# Limitations and notes for QuickBooks connector


The following are limitations or notes for the QuickBooks connector:
+ In the `taxAgency` API, ordering results with order by does not work as expected.

# Connecting to a REST API


AWS Glue allows you to configure an AWS Glue ConnectionType that connects AWS Glue to any REST API-based data source. You can then use it as a data source in your ETL jobs. Run these jobs to transfer data between the REST API-based data source and AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for REST API
](rest-api-support.md)
+ [

# Policies containing the API operations for registering connection types and creating/using connections
](rest-api-configuring-iam-permissions.md)
+ [

# Configuring a REST API ConnectionType
](rest-api-configuring.md)
+ [

# Configuring a REST API connection
](rest-api-configuring-connections.md)
+ [

# Tutorial: Creating a REST API ConnectionType and Connection
](rest-api-example.md)
+ [

# Limitations
](rest-api-limitations.md)

# AWS Glue support for REST API


AWS Glue supports REST API as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from a REST API-based data source.

**Supported as a target?**  
No.

# Policies containing the API operations for registering connection types and creating/using connections

The following sample IAM policy describes the required permissions for registering, creating, managing, and using REST API connections within AWS Glue ETL jobs. If you are creating a new role, create a policy that contains the following:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "glue:RegisterConnectionType",
                "glue:ListConnectionTypes",
                "glue:DescribeConnectionType",
                "glue:CreateConnection",
                "glue:RefreshOAuth2Tokens",
                "glue:ListEntities",
                "glue:DescribeEntity"
            ],
            "Resource": "*"
        }
    ]
}
```

You can also use the following IAM policies to allow access:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

If you provide network options when creating a REST API connection, the following actions must also be included in the IAM role:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "secretsmanager:DescribeSecret",
                "secretsmanager:GetSecretValue",
                "secretsmanager:PutSecretValue",
                "ec2:CreateNetworkInterface",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
        }
    ]
}
```

# Configuring a REST API ConnectionType


 Before you can use AWS Glue to transfer data from the REST API-based data source, you must meet these requirements: 

## Minimum requirements


The following are the minimum requirements:
+  You have configured and registered an AWS Glue REST API connection type. See [Connecting to REST APIs](https://docs.aws.amazon.com/glue/latest/dg/rest-api-connections.html). 
+  If using OAuth2 Client Credentials, Authorization Code or JWT, configure the client app accordingly. 

 If you meet these requirements, you’re ready to connect AWS Glue to your REST API-based data source. Typically, no further configurations are needed on the REST API side. 

# Configuring a REST API connection


In order to configure an AWS Glue REST API connector, you need to configure an AWS Glue connection type. This connection type contains details about how the REST data source operates and how it interprets authentication, requests, responses, pagination, validations, and entities/metadata. For a comprehensive list of the required properties for an AWS Glue REST connection type, see the [RegisterConnectionType](https://docs.aws.amazon.com/glue/latest/webapi/API_DescribeConnectionType.html) API and the steps for [Connecting to REST APIs](https://docs.aws.amazon.com/glue/latest/dg/rest-api-connections.html).

 When creating the REST API connector, the following policy is needed to allow relevant actions: 

```
{
    "Version":"2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "glue:RegisterConnectionType",
                "glue:ListConnectionTypes",
                "glue:DescribeConnectionType",
                "glue:CreateConnection",
                "secretsmanager:DescribeSecret",
                "secretsmanager:GetSecretValue",
                "secretsmanager:PutSecretValue",
                "ec2:CreateNetworkInterface",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
        }
    ]
}
```

# Tutorial: Creating a REST API ConnectionType and Connection


**Connecting to the Foo REST API**

 We will create an AWS Glue REST API ConnectionType and a corresponding AWS Glue connection for the Foo REST API. This API has the following properties (which can be retrieved from REST API documentation). 
+  **Instance URL**: https://foo.cloud.com/rest/v1. 
+  **Authentication type**: OAuth2 (Client Credentials). 
+  **REST method**: GET. 
+  **Pagination type**: Offset with properties “limit” and “offset” placed in query parameter of request. 
+ **Supported entities**:
  +  **Bar**: relative path [/bar.json]. 
  +  **Baz**: relative path [/baz.json]. 

 Once all the details are obtained, we can begin creating the AWS Glue connection to the Foo REST API. 

**To create a REST API connection**:

1.  Create the REST API connection type in AWS Glue by calling the RegisterConnectionType API using AWS API, CLI, or SDK. This will create a new ConnectionType resource in AWS Glue. 

   ```
   {
       "ConnectionType": "REST-FOO-CONNECTOR",
       "IntegrationType": "REST",
       "Description": "AWS Glue Connection Type for the FOO REST API",
       "ConnectionProperties": {
           "Url": {
               "Name": "Url",
               "Required": true,
               "DefaultValue": "https://foo.cloud.com/rest/v1",
               "PropertyType": "USER_INPUT"
           }
       },
       "ConnectorAuthenticationConfiguration": {
           "AuthenticationTypes": ["OAUTH2"],
           "OAuth2Properties": {
               "OAuth2GrantType": "CLIENT_CREDENTIALS"
           }
       },
       "RestConfiguration": {
           "GlobalSourceConfiguration": {
           "RequestMethod": "GET",
           "ResponseConfiguration": {
               "ResultPath": "$.result",
               "ErrorPath": "$.error.message"
           },
           "PaginationConfiguration": {
               "OffsetConfiguration": {
                   "OffsetParameter": {
                       "Key": "offset",
                       "PropertyLocation": "QUERY_PARAM"
                   },
                   "LimitParameter": {
                       "Key": "limit",
                       "PropertyLocation": "QUERY_PARAM",
                       "DefaultValue": "50"
                   }
               }
           }
       },
       "ValidationEndpointConfiguration": {
           "RequestMethod": "GET",
           "RequestPath": "/bar.json?offset=1&limit=10"
       },
       "EntityConfigurations": {
           "bar": {
               "SourceConfiguration": {
                   "RequestMethod": "GET",
                   "RequestPath": "/bar.json",
                   "ResponseConfiguration": {
                       "ResultPath": "$.result",
                       "ErrorPath": "$.error.message"
                   }
               },
               "Schema": {
                   "name": {
                       "Name": "name",
                       "FieldDataType": "STRING"
                   },
                   "description": {
                       "Name": "description",
                       "FieldDataType": "STRING"
                   },
                   "id": {
                       "Name": "id",
                       "FieldDataType": "STRING"
                   },
                   "status": {
                       "Name": "status",
                       "FieldDataType": "STRING"
                   }
               }
           }
       }
   }
   }
   ```

1. In AWS Secrets Manager, create a secret. The secret should contain the connected app consumer secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key.
**Note**  
You must create a secret per connection in AWS Glue.

1.  Create the AWS Glue connection by calling the CreateConnection API using the AWS API, CLI, or SDK. 

   1.  Reference the REST connection type name from Step 1 as the “ConnectionType”. 

   1.  Provide the InstanceUrl and any other ConnectionProperties that were defined during the AWS Glue ConnectionType registration process. 

   1.  Choose from the configured Authentication Types. The REST API Foo uses OAuth2 with the ClientCredentials grant type. 

   1.  Provide the **SecretArn** and other **AuthenticationProperties** that are configured. For example, we have configured `OAUTH2` as the AuthenticationType so we will set the “OAuth2Properties” in the CreateConnectionInput. This will require properties like “OAuth2GrantType”, “TokenUrl”, and “OAuth2ClientApplication”. 

1.  Make the CreateConnection request which will create the AWS Glue connection. 

   ```
   {
       "ConnectionInput": {
           "Name": "ConnectionFooREST",
           "ConnectionType": "REST-FOO-CONNECTOR",
           "ConnectionProperties": {},
           "ValidateCredentials": true,
           "AuthenticationConfiguration": {
               "AuthenticationType": "OAUTH2",
               "SecretArn": "arn:aws:secretsmanager:<region>:<accountId>:secret:<secretId>",
               "OAuth2Properties": {
                   "OAuth2GrantType": "CLIENT_CREDENTIALS",
                   "TokenUrl": "https://foo.cloud.com/oauth/token",
                   "OAuth2ClientApplication": {
                       "UserManagedClientApplicationClientId": "your-managed-client-id"
                   }
               }
           }
       }
   }
   ```
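The same request can be issued programmatically. The following is a minimal sketch, assuming the boto3 library and the placeholder values from the example above (the connector name, secret ARN, and client ID are hypothetical and must be replaced):

```python
# Hypothetical request body mirroring the example above.
connection_input = {
    "Name": "ConnectionFooREST",
    "ConnectionType": "REST-FOO-CONNECTOR",
    "ConnectionProperties": {},
    "ValidateCredentials": True,
    "AuthenticationConfiguration": {
        "AuthenticationType": "OAUTH2",
        "SecretArn": "arn:aws:secretsmanager:<region>:<accountId>:secret:<secretId>",
        "OAuth2Properties": {
            "OAuth2GrantType": "CLIENT_CREDENTIALS",
            "TokenUrl": "https://foo.cloud.com/oauth/token",
            "OAuth2ClientApplication": {
                "UserManagedClientApplicationClientId": "your-managed-client-id"
            },
        },
    },
}

if __name__ == "__main__":
    import boto3  # requires AWS credentials; shown as a sketch only

    boto3.client("glue").create_connection(ConnectionInput=connection_input)
```

Passing the dict as `ConnectionInput` is equivalent to the raw JSON request shown above.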

# Limitations


The following are limitations for the REST API connector:
+  The REST API connector is only available through the AWS API, CLI, or SDK. You cannot configure REST connectors through the console. 
+  The AWS Glue REST ConnectionType can only be configured to READ data from the REST API-based data source. The connection can only be used as a SOURCE in AWS Glue ETL jobs. 
+  Filtering and partitioning are not supported. 
+  Field selection is not supported. 

# Connecting to Salesforce

Salesforce provides customer relationship management (CRM) software that helps you with sales, customer service, e-commerce, and more. If you're a Salesforce user, you can connect AWS Glue to your Salesforce account. Then, you can use Salesforce as a data source or destination in your ETL jobs. Run these jobs to transfer data between Salesforce and AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for Salesforce
](salesforce-support.md)
+ [

# Policies containing the API operations for creating and using connections
](salesforce-configuring-iam-permissions.md)
+ [

# Configuring Salesforce
](salesforce-configuring.md)
+ [

## Apply System Admin profile
](#salesforce-configuring-apply-system-admin-profile)
+ [

# Configuring Salesforce connections
](salesforce-configuring-connections.md)
+ [

# Reading from Salesforce
](salesforce-reading-from-entities.md)
+ [

# Writing to Salesforce
](salesforce-writing-to.md)
+ [

# Salesforce connection options
](salesforce-connection-options.md)
+ [

# Limitations for the Salesforce connector
](salesforce-connector-limitations.md)
+ [

# Set up the Authorization Code flow for Salesforce
](salesforce-setup-authorization-code-flow.md)
+ [

# Set up the JWT bearer OAuth flow for Salesforce
](salesforce-setup-jwt-bearer-oauth.md)

# AWS Glue support for Salesforce


AWS Glue supports Salesforce as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Salesforce.

**Supported as a target?**  
Yes. You can use AWS Glue ETL jobs to write records into Salesforce.

**Supported Salesforce API versions**  
The following Salesforce API versions are supported:
+ v58.0
+ v59.0
+ v60.0

# Policies containing the API operations for creating and using connections

The following sample IAM policy describes the required permissions for creating, managing, and using Salesforce connections within AWS Glue ETL jobs. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "secretsmanager:DescribeSecret",
        "secretsmanager:GetSecretValue",
        "secretsmanager:PutSecretValue",
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

You can also use the following IAM policies to allow access:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

If you provide network options when creating a Salesforce connection, you must also include the following actions in the IAM role:

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateNetworkInterface",
        "ec2:DescribeNetworkInterfaces",
        "ec2:DeleteNetworkInterface"
      ],
      "Resource": "*"
    }
  ]
}
```

------

 For Zero-ETL Salesforce connections, see [Zero-ETL prerequisites](https://docs.aws.amazon.com/glue/latest/dg/zero-etl-prerequisites.html). 

# Configuring Salesforce


Before you can use AWS Glue to transfer data to or from Salesforce, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a Salesforce account.
+ Your Salesforce account is enabled for API access. API access is enabled by default for the Enterprise, Unlimited, Developer, and Performance editions.

If you meet these requirements, you’re ready to connect AWS Glue to your Salesforce account. AWS Glue handles the remaining requirements with the AWS managed connected app.

## The AWS managed connected app for Salesforce


The AWS managed connected app helps you create a Salesforce connection in fewer steps. In Salesforce, a connected app is a framework that authorizes external applications, like AWS Glue, to access your Salesforce data using OAuth 2.0. To use the AWS managed connected app, create a Salesforce connection by using the AWS Glue console. When you configure the connection, set the **OAuth grant type** to **Authorization code** and leave the box checked for **Use AWS managed client application**.

When saving the connection, you are redirected to Salesforce to log in and approve AWS Glue access to your Salesforce account.

## Apply System Admin profile


 In Salesforce, follow the steps to apply the System Admin profile: 

1.  In Salesforce, navigate to **Settings > Connected Apps > Connected Apps OAuth Usage**. 

1.  In the list of connected apps, find AWS Glue and choose **Install**. If needed, choose **Unblock**. 

1.  Navigate to **Settings > Manage Connected Apps**, then choose **AWS Glue**. Under OAuth Policies, choose **Admin approved users are pre-authorized** and select the **System Admin** profile. This restricts access to AWS Glue to users with the System Admin profile. 


# Configuring Salesforce connections


To configure a Salesforce connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For the JWT_BEARER grant type, the secret should contain the `JWT_TOKEN` key with its value.

   1. For the AuthorizationCode grant type:

      1. For an AWS managed connected app, provide an empty secret or a secret with a temporary value.

      1. For a customer managed connected app, the secret should contain the connected app `Consumer Secret` with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key.

   1. Note: You must create a secret for your connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select **Salesforce**.

   1. Provide the `INSTANCE_URL` of the Salesforce instance you want to connect to.

   1. Provide the Salesforce environment.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the OAuth2 grant type that you want to use for the connection. The grant type determines how AWS Glue communicates with Salesforce to request access to your data. Your choice affects the requirements that you must meet before you create the connection. You can choose either of these types:
      + **JWT_BEARER grant type**: This grant type works well for automation scenarios because it allows a JSON Web Token (JWT) to be created up front with the permissions of a particular user in the Salesforce instance. The creator controls how long the JWT is valid. AWS Glue uses the JWT to obtain an access token, which is used to call Salesforce APIs.

        This flow requires that the user has created a connected app in their Salesforce instance which enables issuing JWT-based access tokens for users.

        For information on creating a connected app for the JWT bearer OAuth flow, see [OAuth 2.0 JWT bearer flow for server-to-server integration](https://help.salesforce.com/s/articleView?id=sf.remoteaccess_oauth_jwt_flow.htm). To set up the JWT bearer flow with the Salesforce connected app, see [Set up the JWT bearer OAuth flow for Salesforce](salesforce-setup-jwt-bearer-oauth.md).
      + **AUTHORIZATION_CODE grant type**: This grant type is considered "three-legged" OAuth because it relies on redirecting users to the third-party authorization server to authenticate. It is used when creating connections through the AWS Glue console. By default, the user creating a connection can rely on the AWS Glue connected app (the AWS Glue managed client application), so they do not need to provide any OAuth-related information except their Salesforce instance URL. The AWS Glue console redirects the user to Salesforce, where the user must log in and grant AWS Glue the requested permissions to access their Salesforce instance.

        Users may still opt to create their own connected app in Salesforce and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to Salesforce to login and authorize AWS Glue to access their resources.

        This grant type results in a refresh token and an access token. The access token is short-lived and can be refreshed automatically, without user interaction, by using the refresh token.

        For information on creating a connected app for the Authorization Code OAuth flow, see [Set up the Authorization Code flow for Salesforce](salesforce-setup-authorization-code-flow.md).

   1. Select the `secretName` which you want to use for this connection in AWS Glue to store the OAuth 2.0 tokens.

   1. Select the network options if you want to connect through your own network (VPC).

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

1. If providing network options, also grant the IAM role the following permissions:

------
#### [ JSON ]

****  

   ```
   {
     "Version": "2012-10-17",
     "Statement": [
       {
         "Effect": "Allow",
         "Action": [
           "ec2:CreateNetworkInterface",
           "ec2:DescribeNetworkInterfaces",
           "ec2:DeleteNetworkInterface"
         ],
         "Resource": "*"
       }
     ]
   }
   ```

------
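Step 1 of this procedure can also be done programmatically. The following is a minimal sketch, assuming boto3; the secret name and secret value are placeholders, and you should store the key that matches your grant type (`JWT_TOKEN` for the JWT bearer flow, `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` for a customer managed connected app):

```python
import json

# Hypothetical secret payload for a customer managed connected app
# (Authorization Code grant). For the JWT bearer flow, store a
# "JWT_TOKEN" key instead.
secret_value = {
    "USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET": "your-consumer-secret"
}

if __name__ == "__main__":
    import boto3  # requires AWS credentials; shown as a sketch only

    boto3.client("secretsmanager").create_secret(
        Name="salesforce-conn1-secret",  # hypothetical secret name
        SecretString=json.dumps(secret_value),
    )
```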

## Configuring Salesforce connections with the AWS CLI


You can create Salesforce connections using the AWS CLI:

```
aws glue create-connection --connection-input \
"{\"Name\": \"salesforce-conn1\",\"ConnectionType\": \"SALESFORCE\",\"ConnectionProperties\": {\"ROLE_ARN\": \"arn:aws:iam::123456789012:role/glue-role\",\"INSTANCE_URL\": \"https://example.my.salesforce.com\"},\"ValidateCredentials\": true,\"AuthenticationConfiguration\": {\"AuthenticationType\": \"OAUTH2\",\"SecretArn\": \"arn:aws:secretsmanager:us-east-1:123456789012:secret:salesforce-conn1-secret-IAmcdk\",\"OAuth2Properties\": {\"OAuth2GrantType\": \"JWT_BEARER\",\"TokenUrl\": \"https://login.salesforce.com/services/oauth2/token\"}}}" \
--endpoint-url https://glue.us-east-1.amazonaws.com \
--region us-east-1
```

# Reading from Salesforce


**Prerequisite**

A Salesforce sObject that you would like to read from. You will need the object name, such as `Account`, `Case`, or `Opportunity`.

**Example**:

```
salesforce_read = glueContext.create_dynamic_frame.from_options(
    connection_type="salesforce",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "Account",
        "API_VERSION": "v60.0"
    }
)
```

## Partitioning queries


You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query would be split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For Date or Timestamp fields, the connector accepts the Spark timestamp format used in Spark SQL queries.

  Examples of valid values:

  ```
  "TIMESTAMP \"1707256978123\""
  "TIMESTAMP '2018-01-01 00:00:00.000 UTC'"
  "TIMESTAMP \"2018-01-01 00:00:00 Pacific/Tahiti\"" 
  "TIMESTAMP \"2018-01-01 00:00:00\""
  "TIMESTAMP \"-123456789 Pacific/Tahiti\""
  "TIMESTAMP \"1702600882\""
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.
+  `TRANSFER_MODE`: supports two modes: `SYNC` and `ASYNC`. Default is `SYNC`. When set to `ASYNC`, Bulk API 2.0 Query will be utilized for processing. 

Example:

```
salesforce_read = glueContext.create_dynamic_frame.from_options(
    connection_type="salesforce",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "Account",
        "API_VERSION": "v60.0",
        "PARTITION_FIELD": "SystemModstamp",
        "LOWER_BOUND": "TIMESTAMP '2021-01-01 00:00:00 Pacific/Tahiti'",
        "UPPER_BOUND": "TIMESTAMP '2023-01-10 00:00:00 Pacific/Tahiti'",
        "NUM_PARTITIONS": "10",
        "TRANSFER_MODE": "ASYNC" 
    }
)
```
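The splitting described above can be sketched as follows. This is an illustration only, not connector internals, under the assumption that the bounds are divided into evenly sized, contiguous sub-ranges:

```python
from datetime import datetime

def partition_ranges(lower: datetime, upper: datetime, num_partitions: int):
    """Split the inclusive lower / exclusive upper bound [lower, upper)
    into num_partitions contiguous [start, end) sub-ranges."""
    step = (upper - lower) / num_partitions
    return [
        (lower + i * step, lower + (i + 1) * step)
        for i in range(num_partitions)
    ]

# Mirrors the bounds in the example above: each sub-range becomes one sub-query.
ranges = partition_ranges(datetime(2021, 1, 1), datetime(2023, 1, 10), 10)
```

Each resulting `(start, end)` pair corresponds to one sub-query that a Spark task can execute concurrently.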

## FILTER_PREDICATE option


**FILTER_PREDICATE**: An optional parameter. This option is used to filter the query.

Examples of **FILTER_PREDICATE**:

```
     Case 1: FILTER_PREDICATE with single criterion
     Examples: 	
       LastModifiedDate >= TIMESTAMP '2025-04-01 00:00:00 Pacific/Tahiti'
       LastModifiedDate <= TIMESTAMP "2025-04-01 00:00:00"
       LastModifiedDate >= TIMESTAMP '2018-01-01 00:00:00.000 UTC'
       LastModifiedDate <= TIMESTAMP "-123456789 Pacific/Tahiti"
       LastModifiedDate <= TIMESTAMP "1702600882"

     Case 2: FILTER_PREDICATE with multiple criteria
     Examples: 
       LastModifiedDate >= TIMESTAMP '2025-04-01 00:00:00 Pacific/Tahiti' AND Id = "0012w00001CotGiAAJ"
       LastModifiedDate >= TIMESTAMP "1702600882" AND Id = "001gL000002i26MQAQ"

     Case 3: FILTER_PREDICATE single criterion with LIMIT
     Examples: 
       LastModifiedDate >= TIMESTAMP "1702600882" LIMIT 2

     Case 4: FILTER_PREDICATE with LIMIT
     Examples: 
       LIMIT 2
```
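To show where one of these predicates plugs in, here is a hedged sketch of read options in the same shape as the read example earlier; the connection name and entity are placeholders, and the dict would be passed as `connection_options` to `glueContext.create_dynamic_frame.from_options` with `connection_type="salesforce"`:

```python
# Hypothetical read options using Case 3 above: a single criterion with LIMIT.
connection_options = {
    "connectionName": "connectionName",
    "ENTITY_NAME": "Account",
    "API_VERSION": "v60.0",
    "FILTER_PREDICATE": 'LastModifiedDate >= TIMESTAMP "1702600882" LIMIT 2',
}
```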

# Writing to Salesforce


**Prerequisites**

A Salesforce sObject that you would like to write to. You will need the object name, such as `Account`, `Case`, or `Opportunity`.

The Salesforce connector supports four write operations:
+ INSERT
+ UPSERT
+ UPDATE
+ DELETE

When using the `UPSERT` write operation, the `ID_FIELD_NAMES` option must be provided to specify the external ID field for the records.

 You can also add connection options: 
+  `TRANSFER_MODE`: Supports two modes: `SYNC` and `ASYNC`. Default is `SYNC`. When set to `ASYNC`, Bulk API 2.0 Ingest will be utilized for processing. 
+  `FAIL_ON_FIRST_ERROR`: The default value is `FALSE`, which means the AWS Glue job will continue processing all the data even if there are some failed write records. When set to `TRUE`, the AWS Glue job will fail if there are any failed write records, and it will not continue processing. 

**Example**

```
salesforce_write = glueContext.write_dynamic_frame.from_options(
    frame=frameToWrite,
    connection_type="salesforce",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "Account",
        "API_VERSION": "v60.0",
        "WRITE_OPERATION": "INSERT",
        "TRANSFER_MODE": "ASYNC",
        "FAIL_ON_FIRST_ERROR": "true"
    }
)
```

# Salesforce connection options


The following connection options are supported for the Salesforce connector:
+ `ENTITY_NAME` (String) - (Required) Used for Read/Write. The name of your object in Salesforce.
+ `API_VERSION` (String) - (Required) Used for Read/Write. The Salesforce REST API version you want to use.
+ `SELECTED_FIELDS` (List<String>) - Default: empty (SELECT *). Used for Read. The columns you want to select for the object.
+ `FILTER_PREDICATE` (String) - Default: empty. Used for Read. It should be in the Spark SQL format.

  When providing a filter predicate, only the `AND` operator is supported. Other operators such as `OR` and `IN` are not currently supported.
+ `QUERY` (String) - Default: empty. Used for Read. A full Spark SQL query.
+ `PARTITION_FIELD` (String) - Used for Read. The field to be used to partition the query.
+ `LOWER_BOUND` (String) - Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND` (String) - Used for Read. An exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS` (Integer) - Default: 1. Used for Read. The number of partitions for read.
+ `IMPORT_DELETED_RECORDS` (String) - Default: FALSE. Used for Read. Whether to include deleted records when querying.
+ `WRITE_OPERATION` (String) - Default: INSERT. Used for Write. The value should be INSERT, UPDATE, UPSERT, or DELETE.
+ `ID_FIELD_NAMES` (String) - Default: null. Required for UPDATE and UPSERT.
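To make the option shapes concrete, here is a hedged sketch of read options selecting specific columns; the connection, entity, and field names are placeholders, and the dict would be passed as `connection_options` in a Salesforce read:

```python
# Hypothetical read options; SELECTED_FIELDS narrows the columns returned
# (default is all fields), and IMPORT_DELETED_RECORDS includes deleted rows.
read_options = {
    "connectionName": "connectionName",
    "ENTITY_NAME": "Account",
    "API_VERSION": "v60.0",
    "SELECTED_FIELDS": ["Id", "Name", "LastModifiedDate"],
    "IMPORT_DELETED_RECORDS": "TRUE",
}
```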

# Limitations for the Salesforce connector


The following are limitations for the Salesforce connector:
+ Only Spark SQL is supported; Salesforce SOQL is not supported.
+ Job bookmarks are not supported.
+ Salesforce field names are case sensitive. When writing to Salesforce, data must match the casing of the fields defined within Salesforce.

# Set up the Authorization Code flow for Salesforce


Refer to Salesforce public documentation for enabling the OAuth 2.0 Authorization Code flow.

To configure the connected app:

1. Activate the **Enable OAuth Settings** checkbox.

1. In the **Callback URL** text field, enter one or more redirect URLs for AWS Glue.

   Redirect URLs have the following format:

   https://*region*.console.aws.amazon.com/gluestudio/oauth

   In this URL, *region* is the code for the AWS Region where you use AWS Glue to transfer data from Salesforce. For example, the code for the US East (N. Virginia) Region is `us-east-1`. For that Region, the URL is the following:

   https://us-east-1.console.aws.amazon.com/gluestudio/oauth

   For the AWS Regions that AWS Glue supports, and their codes, see [AWS Glue endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/glue.html) in the *AWS General Reference*.

1. Activate the **Require Secret for Web Server Flow** checkbox.

1. In the **Available OAuth Scopes** list, add the following scopes:
   + Manage user data via APIs (api)
   + Access custom permissions (custom_permissions)
   + Access the identity URL service (id, profile, email, address, phone)
   + Access unique user identifiers (openid)
   + Perform requests at any time (refresh_token, offline_access)

1. Set the refresh token policy for the connected app to **Refresh token is valid until revoked**. Otherwise, your jobs will fail when your refresh token expires. For more information on how to check and edit the refresh token policy, see [Manage OAuth Access Policies for a Connected App](https://help.salesforce.com/articleView?id=connected_app_manage_oauth.htm) in the Salesforce documentation.

# Set up the JWT bearer OAuth flow for Salesforce


Refer to Salesforce public documentation for enabling server-to-server integration with [OAuth 2.0 JSON Web Tokens](https://help.salesforce.com/s/articleView?id=sf.remoteaccess_oauth_jwt_flow.htm).

Once you have created a JWT and configured the connected app appropriately in Salesforce, you can create a new Salesforce connection with the `JWT_TOKEN` key set in your Secrets Manager secret. Set the OAuth grant type to **JWT Bearer Token** when creating the connection.
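As a rough illustration of what the token carries, a Salesforce JWT bearer assertion contains claims like the following. All values here are placeholders, and the assertion must additionally be signed with your connected app's certificate using RS256 (for example, with the PyJWT library), which is not shown:

```python
import time

# Placeholder claims for a Salesforce JWT bearer assertion.
claims = {
    "iss": "your-connected-app-consumer-key",  # connected app consumer key
    "sub": "user@example.com",                 # Salesforce username
    "aud": "https://login.salesforce.com",     # or your sandbox login URL
    "exp": int(time.time()) + 300,             # short-lived expiry, e.g. 5 min
}
```

The signed assertion is then exchanged for an access token at the Salesforce token endpoint using the `urn:ietf:params:oauth:grant-type:jwt-bearer` grant type.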

# Connecting to Salesforce Marketing Cloud

Salesforce Marketing Cloud is a provider of marketing automation and analytics software for email, mobile, social, and online marketing. It also offers consulting and implementation services. As a Salesforce Marketing Cloud user, you can connect AWS Glue to your Salesforce Marketing Cloud account. Then, you can use Salesforce Marketing Cloud as a data source or destination in your ETL jobs. Run these jobs to transfer data between Salesforce Marketing Cloud and AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for Salesforce Marketing Cloud
](salesforce-marketing-cloud-support.md)
+ [

# Policies containing the API operations for creating and using connections
](salesforce-marketing-cloud-configuring-iam-permissions.md)
+ [

# Configuring Salesforce Marketing Cloud
](salesforce-marketing-cloud-configuring.md)
+ [

# Configuring Salesforce Marketing Cloud connections
](salesforce-marketing-cloud-configuring-connections.md)
+ [

# Reading from Salesforce Marketing Cloud entities
](salesforce-marketing-cloud-reading-from-entities.md)
+ [

# Writing to Salesforce Marketing Cloud entities
](salesforce-marketing-cloud-writing-to-entities.md)
+ [

# Salesforce Marketing Cloud connection options
](salesforce-marketing-cloud-connection-options.md)
+ [

# Limitations and notes for Salesforce Marketing Cloud connector
](salesforce-marketing-cloud-connector-limitations.md)

# AWS Glue support for Salesforce Marketing Cloud


AWS Glue supports Salesforce Marketing Cloud as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Salesforce Marketing Cloud.

**Supported as a target?**  
No.

**Supported Salesforce Marketing Cloud API versions**  
The following Salesforce Marketing Cloud API versions are supported:
+ v1

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Salesforce Marketing Cloud


Before you can use AWS Glue to transfer data from Salesforce Marketing Cloud, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a Salesforce Marketing Cloud account. For more information, see [Creating a Salesforce Marketing Cloud account](#salesforce-marketing-cloud-configuring-creating-salesforce-marketing-cloud-account).
+ Your Salesforce Marketing Cloud account is enabled for API access. API access is enabled by default for the Enterprise, Unlimited, Developer, and Performance editions.

If you meet these requirements, you’re ready to connect AWS Glue to your Salesforce Marketing Cloud account. For typical connections, you don't need to do anything else in Salesforce Marketing Cloud.

## Creating a Salesforce Marketing Cloud account


For Salesforce Marketing Cloud, you need to contact the vendor for account creation. If you or your company has an association with Salesforce, contact your Salesforce account manager to request a Salesforce Marketing Cloud license. Otherwise, you can request contact from a Salesforce representative as follows: 

1. Go to https://www.salesforce.com/in/products/marketing-cloud/overview/ and choose **Sign up**.

1. Select the **Contact Us** link on the top right of the page.

1. Enter the required information in the form and choose **Contact Me**.

A Salesforce representative will contact you to discuss your requirements.

## Creating a project and OAuth 2.0 credentials


To get a project and OAuth 2.0 credentials:

1. Log in to your [Salesforce Marketing Cloud instance](https://mc.login.exacttarget.com/hub-cas/login) with your username and password, and authenticate using your registered mobile number.

1. Click on your profile at the top right corner and then go to **Setup**.

1. Under **Platform Tools** choose **Apps** and then choose **Installed Packages**.  
![](http://docs.aws.amazon.com/glue/latest/dg/images/sfmc-platform-tools.png)

1. On the **Installed Packages** page, click **New** at the top right corner. Provide the name and description of the package.

   Save the package. After the package is saved, you can view the package details.

1. On the **Details** page of the package, under the **Component** section, choose **Add Component**.   
![](http://docs.aws.amazon.com/glue/latest/dg/images/sfmc-add-component.png)

1. Select the **Component Type** as 'API Integration' and click **Next**.

1. Select the **Integration Type** as 'Server-to-Server' (which has the client credentials OAuth grant type) and click **Next**.

1. Add the scopes based on your requirements and click **Save**.

# Configuring Salesforce Marketing Cloud connections


Salesforce Marketing Cloud supports the CLIENT CREDENTIALS grant type for OAuth2.
+ This grant type is considered 2-legged OAuth 2.0 because it is used by clients to obtain an access token outside the context of a user. AWS Glue uses the client ID and client secret to authenticate to the Salesforce Marketing Cloud APIs, which are provided by custom services that you define.
+ Each custom service is owned by an API-only user which has a set of roles and permissions which authorize the service to perform specific actions. An access token is associated with a single custom service.
+ This grant type results in a short-lived access token that can be renewed by calling an identity endpoint.
+ For public Salesforce Marketing Cloud documentation for OAuth 2.0 with client credentials, see [Set Up Your Development Environment for Enhanced Packages](https://developer.salesforce.com/docs/marketing/marketing-cloud/guide/mc-dev-setup-enhanced.html).

To configure a Salesforce Marketing Cloud connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For a customer managed connected app, the secret should contain the connected app Consumer Secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key.

   1. Note: You must create a secret for your connections in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select Salesforce Marketing Cloud.

   1. Provide the `Subdomain Endpoint` of the Salesforce Marketing Cloud you want to connect to.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` which you want to use for this connection in AWS Glue to store the tokens.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

# Reading from Salesforce Marketing Cloud entities


**Prerequisite**

A Salesforce Marketing Cloud object that you would like to read from. You will need the object name, such as `Activity` or `Campaigns`. The following table shows the supported entities.

**Supported entities for source**:


| Entity | Interface | Can be filtered | Supports limit | Supports Order by | Supports SELECT * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | --- | 
| Event Notification Callback | REST | No | No | No | Yes | No | 
| Seed-List | REST | No | Yes | No | Yes | No | 
| Setup | REST | Yes | Yes | No | Yes | No | 
| Domain Verification | REST | Yes | Yes | Yes | Yes | No | 
| Objects Nested Tags | REST | Yes | No | No | Yes | No | 
| Contact | REST | No | Yes | No | Yes | No | 
| Event Notification Subscription | REST | No | No | No | Yes | No | 
| Messaging | REST | No | Yes | No | Yes | No | 
| Activity | SOAP | No | No | No | Yes | Yes | 
| Bounce Event | SOAP | No | No | No | Yes | Yes | 
| Click Event | SOAP | No | No | No | Yes | Yes | 
| Content Area | SOAP | No | No | No | Yes | Yes | 
| Data Extension | SOAP | No | Yes | No | Yes | Yes | 
| Email | SOAP | No | Yes | No | Yes | Yes | 
| Forwarded Email Event | SOAP | No | Yes | No | Yes | Yes | 
| Forward Email OptInEvent | SOAP | No | Yes | No | Yes | Yes | 
| Link | SOAP | No | Yes | No | Yes | Yes | 
| Link Send | SOAP | No | Yes | No | Yes | Yes | 
| List | SOAP | No | Yes | No | Yes | Yes | 
| List Subscriber | SOAP | No | Yes | No | Yes | Yes | 
| Not Sent Event | SOAP | No | Yes | No | Yes | Yes | 
| Open Event | SOAP | No | Yes | No | Yes | Yes | 
| Send | SOAP | No | Yes | No | Yes | Yes | 
| Sent Event | SOAP | No | Yes | No | Yes | Yes | 
| Subscriber | SOAP | No | Yes | No | Yes | Yes | 
| Survey Event | SOAP | No | Yes | No | Yes | Yes | 
| Unsub Event | SOAP | No | Yes | No | Yes | Yes | 
| Audit Events | REST | No | Yes | Yes | Yes | No | 
| Campaigns | REST | No | Yes | Yes | Yes | No | 
| Interactions | REST | No | Yes | Yes | Yes | No | 
| Content Assets | REST | No | Yes | Yes | Yes | No | 

**Example for REST**:

```
salesforcemarketingcloud_read = glueContext.create_dynamic_frame.from_options(
    connection_type="salesforcemarketingcloud",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "Campaigns",
        "API_VERSION": "v1",
        "INSTANCE_URL": "https://**********************.rest.marketingcloudapis.com"
    }
)
```

**Example for SOAP**:

```
salesforcemarketingcloud_read = glueContext.create_dynamic_frame.from_options(
    connection_type="salesforcemarketingcloud",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "Activity",
        "API_VERSION": "v1",
        "INSTANCE_URL": "https://**********************.soap.marketingcloudapis.com"
    }
)
```

**Salesforce Marketing Cloud entity and field details**:

The following tables describe the Salesforce Marketing Cloud entities. There are REST entities with static metadata and SOAP entities with dynamic metadata.

**REST entities with static metadata**:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/salesforce-marketing-cloud-reading-from-entities.html)

**SOAP entities with dynamic metadata**:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/salesforce-marketing-cloud-reading-from-entities.html)

## Partitioning queries


In Salesforce Marketing Cloud, the Integer and DateTime datatype fields support field-based partitioning.

You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For the timestamp field, we accept the Spark timestamp format used in Spark SQL queries.

  Example of a valid value:

  ```
  "2024-05-07T02:03:00.00Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.

Example:

```
salesforcemarketingcloud_read = glueContext.create_dynamic_frame.from_options(
    connection_type="salesforcemarketingcloud",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "ListSubscriber",
        "API_VERSION": "v1",
        "PARTITION_FIELD": "CreatedDate",
        "LOWER_BOUND": "2023-09-07T02:03:00.000Z",
        "UPPER_BOUND": "2024-05-07T02:03:00.000Z",
        "NUM_PARTITIONS": "10"
    }
)
```
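Conceptually, partitioning divides the bounded range into `NUM_PARTITIONS` equal, contiguous sub-ranges, each keeping the inclusive-lower and exclusive-upper convention. A minimal sketch of that arithmetic (plain Python, not connector code):

```python
from datetime import datetime

def split_partitions(lower: str, upper: str, num_partitions: int):
    """Split [lower, upper) into num_partitions contiguous sub-ranges."""
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ"
    lo = datetime.strptime(lower, fmt)
    hi = datetime.strptime(upper, fmt)
    step = (hi - lo) / num_partitions
    # Each sub-range keeps the inclusive-lower / exclusive-upper convention.
    bounds = [lo + step * i for i in range(num_partitions)] + [hi]
    return list(zip(bounds, bounds[1:]))

ranges = split_partitions(
    "2023-09-07T02:03:00.000Z", "2024-05-07T02:03:00.000Z", 10
)
```

The sub-ranges are contiguous, so every record in the original bounded range falls into exactly one sub-query.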

# Writing to Salesforce Marketing Cloud entities


**Prerequisites**
+ A Salesforce Marketing Cloud object that you wish to write to. You will need to specify the object's name, such as `List` or `Campaigns`, or any of the other entities listed in the table below.
+ The Salesforce Marketing Cloud connector supports three write operations:
  + INSERT
  + UPSERT
  + UPDATE

  When using the `UPDATE` and `UPSERT` write operations, you must provide the `ID_FIELD_NAMES` option to specify the external ID field for the records. 

**Supported entities for destination**:


| Entity | Priority | Interface (REST, SOAP, etc) | Can be Inserted | Can be Updated | Can be Upserted | 
| --- | --- | --- | --- | --- | --- | 
| Campaigns | P0 | REST | Y- Single | Y- Single | N | 
| Content Assets | P0 | REST | Y- Single, Bulk | Y- Single | N | 
| Contact | P1 | REST | Y- Single | Y- Single | N | 
| Domain Verification | P1 | REST | Y- Single | Y- Single, Bulk | N | 
| Event Notification Callback | P1 | REST | Y- Single | Y- Single | N | 
| Event Notification Subscription | P1 | REST | Y- Single | Y- Single | N | 
| Messaging | P1 | REST | Y- Single | N | N | 
| Object Nested Tag | P2 | REST | Y- Single | Y- Single | N | 
| Seed-List | P1 | REST | Y- Single | Y- Single | N | 
| Setup | P1 | REST | Y- Single | Y- Single | N | 
| Data Extension | P0 | SOAP | Y- Single | Y- Single | Y- Single | 
| Email | P0 | SOAP | Y- Single | Y- Single | N | 
| List | P0 | SOAP | Y- Single | Y- Single | N | 
| Send | P0 | SOAP | Y- Single | N | N | 
| Subscriber | P0 | SOAP | Y- Single | Y- Single | N | 

**Example for INSERT operation for REST**:

```
salesforcemarketingcloud_write = glueContext.write_dynamic_frame.from_options(
    connection_type="salesforcemarketingcloud",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "Campaigns",
        "API_VERSION": "v1",
        "writeOperation" : "INSERT",
        "INSTANCE_URL": "https://**********************.rest.marketingcloudapis.com"
    }
)
```

**Example for INSERT operation for SOAP**:

```
salesforcemarketingcloud_write = glueContext.write_dynamic_frame.from_options(
    connection_type="salesforcemarketingcloud",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "List",
        "API_VERSION": "v1",
        "writeOperation" : "INSERT",
        "INSTANCE_URL": "https://**********************.rest.marketingcloudapis.com"
    }
)
```

**Example for UPDATE operation for REST**:

```
salesforcemarketingcloud_write = glueContext.write_dynamic_frame.from_options(
    connection_type="salesforcemarketingcloud",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "Campaigns",
        "API_VERSION": "v1",
        "writeOperation" : "UPDATE",
         "ID_FIELD_NAMES": "id",
        "INSTANCE_URL": "https://**********************.rest.marketingcloudapis.com"
    }
)
```

**Example for UPDATE operation for SOAP**:

```
salesforcemarketingcloud_write = glueContext.write_dynamic_frame.from_options(
    connection_type="salesforcemarketingcloud",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "List",
        "API_VERSION": "v1",
        "writeOperation" : "UPDATE",
         "ID_FIELD_NAMES": "id",
        "INSTANCE_URL": "https://**********************.rest.marketingcloudapis.com"
    }
)
```

**Example for UPSERT operation for SOAP**:

```
salesforcemarketingcloud_write = glueContext.write_dynamic_frame.from_options(
    connection_type="salesforcemarketingcloud",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "DataExtension/Insert-***E/6*******3",
        "API_VERSION": "v1",
        "writeOperation" : "UPSERT",
        "INSTANCE_URL": "https://**********************.rest.marketingcloudapis.com"
    }
)
```

# Salesforce Marketing Cloud connection options


The following are connection options for Salesforce Marketing Cloud:
+ `ENTITY_NAME` (String) - (Required) Used for Read. The name of your object in Salesforce Marketing Cloud.
+ `API_VERSION` (String) - (Required) Used for Read. The Salesforce Marketing Cloud REST and SOAP API version that you want to use.
+ `SELECTED_FIELDS` (List<String>) - Default: empty (SELECT *). Used for Read. The columns that you want to select for the object.
+ `FILTER_PREDICATE` (String) - Default: empty. Used for Read. A filter condition in Spark SQL format.
+ `QUERY` (String) - Default: empty. Used for Read. A full Spark SQL query.
+ `PARTITION_FIELD` (String) - Used for Read. The field used to partition the query.
+ `LOWER_BOUND` (String) - Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND` (String) - Used for Read. An exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS` (Integer) - Default: 1. Used for Read. The number of partitions for the read.
+ `WRITE_OPERATION` (String) - Default: INSERT. Used for Write. The value must be INSERT, UPDATE, or UPSERT.
+ `ID_FIELD_NAMES` (String) - Default: null. Required for UPDATE and UPSERT operations.
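As a sketch of how the write-related options fit together, the following hypothetical helper (illustrative only, not part of AWS Glue) checks a `connection_options` dictionary for the constraints described above before a job run:

```python
def validate_write_options(opts: dict) -> dict:
    """Check Salesforce Marketing Cloud write options for common mistakes."""
    op = opts.get("WRITE_OPERATION", "INSERT")
    if op not in ("INSERT", "UPDATE", "UPSERT"):
        raise ValueError(f"WRITE_OPERATION must be INSERT, UPDATE, or UPSERT, got {op!r}")
    # UPDATE and UPSERT need an external ID field to match existing records.
    if op in ("UPDATE", "UPSERT") and not opts.get("ID_FIELD_NAMES"):
        raise ValueError("ID_FIELD_NAMES is required for UPDATE and UPSERT")
    return opts

validate_write_options({"ENTITY_NAME": "Campaigns",
                        "WRITE_OPERATION": "UPDATE",
                        "ID_FIELD_NAMES": "id"})
```

Catching a missing `ID_FIELD_NAMES` before submitting the job is cheaper than discovering it at run time.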

# Limitations and notes for Salesforce Marketing Cloud connector


The following are limitations or notes for the Salesforce Marketing Cloud connector:
+ When filtering on a DateTime datatype field, you must pass the value in the format "yyyy-mm-ddThh:MM:ssZ".
+ In Data Preview, Boolean datatype values appear as blank.
+ For SOAP entities, you can define a maximum of two filters; for REST entities, you can define only one filter, which restricts testing partitioning with filters.
+ Several unexpected behaviors have been observed from the SaaS side: the `Link.Alias` field in the `linksend` entity does not support the CONTAINS operator (for example, `Link.Alias CONTAINS "ViewPrivacyPolicy"`), and filter operators for Data Extension entities (such as EQUALS and GREATER THAN) do not return the expected results.
+ The SFMC ClickEvent SOAP API has a delay in reflecting newly created records, so recently created records may not be immediately available in the API response.

  Example: If you create 5 new ClickEvent records at **2025-01-10T14:30:00** and immediately fetch them using the SOAP API, the response might not include all 5 records. It can take up to 5 minutes for the newly created records to appear in the API response. This delay can affect both data retrieval and scheduled runs.
+ Write operations in AWS Glue support two DateTime formats: **2025-03-11T04:46:00** (without milliseconds) and **2025-03-11T04:46:00.000Z** (with milliseconds).
+ For the Event Notification Subscription entity, a subscription can only be created for a verified callback URL, and you can have up to 200 subscriptions per callback.
+ For the Event Notification Callback entity, a maximum of 50 records can be created per account.
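For the two DateTime write formats noted in the list above, a quick way to produce both from the same timestamp (plain Python, independent of the connector):

```python
from datetime import datetime

ts = datetime(2025, 3, 11, 4, 46, 0)
# Without milliseconds, e.g. 2025-03-11T04:46:00
without_ms = ts.strftime("%Y-%m-%dT%H:%M:%S")
# With milliseconds and a UTC suffix, e.g. 2025-03-11T04:46:00.000Z
# (%f yields microseconds; trim to three digits for milliseconds)
with_ms = ts.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
```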

# Connecting to Salesforce Commerce Cloud


The B2C Commerce API is a collection of RESTful APIs for interacting with B2C Commerce instances. It goes by a few different names: Salesforce Commerce API, the acronym SCAPI, or simply Commerce API.

The API enables developers to build a wide range of applications, from full storefronts to custom merchant tools that augment Business Manager. The API is available to all B2C Commerce customers at no extra cost.

The API is divided into two main groups: Shopper APIs and Admin APIs. Within each group, the APIs are organized into API families focused on related functionality.

**Topics**
+ [

# AWS Glue support for Salesforce Commerce Cloud
](salesforce-commerce-cloud-support.md)
+ [

# Policies containing the API operations for creating and using connections
](salesforce-commerce-cloud-configuring-iam-permissions.md)
+ [

# Configuring Salesforce Commerce Cloud
](salesforce-commerce-cloud-configuring.md)
+ [

# Configuring Salesforce Commerce Cloud connections
](salesforce-commerce-cloud-configuring-connections.md)
+ [

# Reading from Salesforce Commerce Cloud entities
](salesforce-commerce-cloud-reading-from-entities.md)
+ [

# Salesforce Commerce Cloud connection option reference
](salesforce-commerce-cloud-connection-options.md)
+ [

# Limitations
](salesforce-commerce-cloud-connector-limitations.md)

# AWS Glue support for Salesforce Commerce Cloud


AWS Glue supports Salesforce Commerce Cloud as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Salesforce Commerce Cloud.

**Supported as a target?**  
No.

**Supported Salesforce Commerce Cloud API versions**  
 v1. 

# Policies containing the API operations for creating and using connections

 The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

You can also use the following managed IAM policies to allow access:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring Salesforce Commerce Cloud


Before you can use AWS Glue to transfer data from Salesforce Commerce Cloud, you must meet these requirements:

## Minimum requirements

+ You have a Salesforce Commerce Cloud client application with a client ID and client secret.
+  Your Salesforce Commerce Cloud account is enabled for API access. 

If you meet these requirements, you're ready to connect AWS Glue to your Salesforce Commerce Cloud account. For typical connections, you don't need to do anything else in Salesforce Commerce Cloud.

# Configuring Salesforce Commerce Cloud connections


Salesforce Commerce Cloud supports the CLIENT_CREDENTIALS grant type for OAuth 2.0.
+ This grant type is considered two-legged OAuth 2.0 because clients use it to obtain an access token outside the context of a user. AWS Glue uses the client ID and client secret to authenticate to the Salesforce Commerce Cloud APIs, which are provided by custom services that you define.
+ Each custom service is owned by an API-only user that has a set of roles and permissions that authorize the service to perform specific actions. An access token is associated with a single custom service.
+ This grant type results in a short-lived access token, which can be renewed by calling the identity endpoint.
+ For more information about generating client credentials, see the [Salesforce documentation](https://developer.salesforce.com/docs/commerce/commerce-api/guide/authorization.html).

To configure a Salesforce Commerce Cloud connection:

1. In AWS Secrets Manager, create a secret with the following details. You must create a secret for each connection in AWS Glue.

   1. For a customer managed connected app, the secret should contain the connected app Consumer Secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key.
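      For example, the secret's value can be a JSON document with that single key (the value shown is a placeholder):

      ```
      {
        "USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET": "your-consumer-secret"
      }
      ```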

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below: 

   1.  Under Data Connections, choose **Create connection**. 

   1. When selecting a **Data Source**, select Salesforce Commerce Cloud.

   1. Provide your Salesforce Commerce Cloud **Short Code**, **Organization ID**, and **Site ID**.

   1. Select the Salesforce Commerce Cloud Domain URL of your Salesforce Commerce Cloud account.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version":"2012-10-17",		 	 	 
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Provide the OAuth scopes (optional) and the User Managed Client Application Client ID of the Salesforce Commerce Cloud instance that you want to connect to.

   1. Select the secret (`secretName`) that you want AWS Glue to use to store the tokens for this connection.

   1.  Select the network options if you want to use your network. 

1.  Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 

1.  In your AWS Glue job configuration, provide `connectionName` as an **Additional network connection**. 

# Reading from Salesforce Commerce Cloud entities


 **Prerequisites** 
+ A Salesforce Commerce Cloud object that you would like to read from. Refer to the supported entities table below to check the available entities.

 **Supported entities** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports SELECT * | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Assignments | Yes | Yes | Yes | Yes | Yes | 
| Campaigns | Yes | Yes | Yes | Yes | Yes | 
| Catalogs | Yes | Yes | Yes | Yes | Yes | 
| Categories | Yes | Yes | Yes | Yes | Yes | 
| Coupons | Yes | Yes | Yes | Yes | Yes | 
| Gift Certificates | Yes | Yes | Yes | Yes | Yes | 
| Products | Yes | Yes | Yes | Yes | Yes | 
| Promotions | Yes | Yes | Yes | Yes | Yes | 
| Source Code Groups | Yes | Yes | Yes | Yes | Yes | 

 **Example** 

```
salesforce_commerce_cloud_read = glueContext.create_dynamic_frame.from_options(
     connection_type="SalesforceCommerceCloud",
     connection_options={
         "connectionName": "connectionName",
         "ENTITY_NAME": "campaign",
         "API_VERSION": "v1"      
     }
)
```

 **Salesforce Commerce Cloud entity and field details** 

 Entities list: 
+  Assignments: [ https://developer.salesforce.com/docs/commerce/commerce-api/references/assignments ]( https://developer.salesforce.com/docs/commerce/commerce-api/references/assignments) 
+  Campaigns: [ https://developer.salesforce.com/docs/commerce/commerce-api/references/campaigns ](https://developer.salesforce.com/docs/commerce/commerce-api/references/campaigns) 
+  Catalogs: [ https://developer.salesforce.com/docs/commerce/commerce-api/references/catalogs ](https://developer.salesforce.com/docs/commerce/commerce-api/references/catalogs) 
+  Categories: [ https://developer.salesforce.com/docs/commerce/commerce-api/references/catalogs?meta=searchCategories ](https://developer.salesforce.com/docs/commerce/commerce-api/references/catalogs?meta=searchCategories) 
+  Gift Certificates: [ https://developer.salesforce.com/docs/commerce/commerce-api/references/gift-certificates ](https://developer.salesforce.com/docs/commerce/commerce-api/references/gift-certificates) 
+  Products: [ https://developer.salesforce.com/docs/commerce/commerce-api/references/products ](https://developer.salesforce.com/docs/commerce/commerce-api/references/products) 
+  Promotions: [ https://developer.salesforce.com/docs/commerce/commerce-api/references/promotions ](https://developer.salesforce.com/docs/commerce/commerce-api/references/promotions) 
+  Source Code Groups: [ https://developer.salesforce.com/docs/commerce/commerce-api/references/source-code-groups ](https://developer.salesforce.com/docs/commerce/commerce-api/references/source-code-groups) 

 **Partitioning queries** 

You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently.
+ `PARTITION_FIELD`: the name of the field used to partition the query.
+ `LOWER_BOUND`: an inclusive lower bound value of the chosen partition field.

  For date fields, we accept the Spark date format used in Spark SQL queries. Example of a valid value: `"2024-02-06"`.
+ `UPPER_BOUND`: an exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.

The following table describes the partitioning fields supported for each entity:


| Entity | Partitioning Field | DataType | 
| --- | --- | --- | 
| Campaigns | lastModified | DateTime | 
| Campaigns | startDate | DateTime | 
| Campaigns | endDate | DateTime | 
| Catalogs | creationDate | DateTime | 
| Categories | creationDate | DateTime | 
| Gift Certificates | merchantId | String | 
| Gift Certificates | creationDate | DateTime | 
| Products | creationDate | DateTime | 
| Products | lastModified | DateTime | 
| Source Code Groups | creationDate | DateTime | 
| Source Code Groups | startTime | DateTime | 
| Source Code Groups | endTime | DateTime | 

 **Example** 

```
 salesforceCommerceCloud_read = glueContext.create_dynamic_frame.from_options(
     connection_type="SalesforceCommerceCloud",
     connection_options={
         "connectionName": "connectionName",
         "ENTITY_NAME": "coupons",
         "API_VERSION": "v1",
         "PARTITION_FIELD": "creationDate",
         "LOWER_BOUND": "2020-05-01T20:55:02.000Z",
         "UPPER_BOUND": "2024-07-11T20:55:02.000Z",
         "NUM_PARTITIONS": "10"
     }
)
```

# Salesforce Commerce Cloud connection option reference


The following are connection options for Salesforce Commerce Cloud:
+ `ENTITY_NAME` (String) - (Required) Used for Read. The name of your object in Salesforce Commerce Cloud.
+ `API_VERSION` (String) - (Required) Used for Read. The Salesforce Commerce Cloud REST API version that you want to use. Example: v1.
+ `SELECTED_FIELDS` (List<String>) - Default: empty (SELECT *). Used for Read. The columns that you want to select for the object.
+ `FILTER_PREDICATE` (String) - Default: empty. Used for Read. A filter condition in Spark SQL format.
+ `QUERY` (String) - Default: empty. Used for Read. A full Spark SQL query.
+ `PARTITION_FIELD` (String) - Used for Read. The field used to partition the query.
+ `LOWER_BOUND` (String) - Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND` (String) - Used for Read. An exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS` (Integer) - Default: 1. Used for Read. The number of partitions for the read.

# Limitations


The following are limitations for the Salesforce Commerce Cloud connector:
+ The CONTAINS filter does not work as expected when partitioning.
+ The CDN Zones entity doesn't support sandbox instances; it supports only development and production instance types. For more information, see [https://help.salesforce.com/s/articleView?id=cc.b2c_embedded_cdn_overview.htm](https://help.salesforce.com/s/articleView?id=cc.b2c_embedded_cdn_overview.htm).
+ In Salesforce Commerce Cloud, there is no API endpoint to fetch dynamic metadata. As a result, custom fields in the Product and Category entities are not supported.
+ Site ID is a mandatory query parameter. You must pass the Site ID value through the Custom Connector Setting. For more information, see [Base URL and Request Formation](https://developer.salesforce.com/docs/commerce/commerce-api/guide/base-url.html).
+ You can apply filters on a maximum of two fields (excluding Levels, if present) in a single API request, using combinations of the operators shown in the following table:    
<a name="salesforce-commerce-cloud-limitations-filters"></a>[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/salesforce-commerce-cloud-connector-limitations.html)
+ For some entities, the data type of a field when retrieving data differs from its data type when the field is used as a searchable field. As a result, filtering is not available for these fields. The following table provides details about such fields.     
<a name="salesforce-commerce-cloud-limitations-filters-provision"></a>[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/salesforce-commerce-cloud-connector-limitations.html)

# Connecting to Salesforce Marketing Cloud Account Engagement

Salesforce Marketing Cloud Account Engagement is a marketing automation solution that helps companies create meaningful connections, generate more pipeline, and empower sales to close more deals. If you are a Salesforce Marketing Cloud Account Engagement user, you can connect AWS Glue to your Salesforce Marketing Cloud Account Engagement account. You can use Salesforce Marketing Cloud Account Engagement as a data source in your ETL jobs. Run these jobs to transfer data from Salesforce Marketing Cloud Account Engagement to AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for Salesforce Marketing Cloud Account Engagement
](salesforce-marketing-cloud-account-engagement-support.md)
+ [

# Policies containing the API operations for creating and using connections
](salesforce-marketing-cloud-account-engagement-configuring-iam-permissions.md)
+ [

# Configuring Salesforce Marketing Cloud Account Engagement
](salesforce-marketing-cloud-account-engagement-configuring.md)
+ [

# Configuring Salesforce Marketing Cloud Account Engagement connections
](salesforce-marketing-cloud-account-engagement-configuring-connections.md)
+ [

# Reading from Salesforce Marketing Cloud Account Engagement entities
](salesforce-marketing-cloud-account-engagement-reading-from-entities.md)
+ [

# Salesforce Marketing Cloud Account Engagement connection options
](salesforce-marketing-cloud-account-engagement-connection-options.md)
+ [

# Limitations and notes for Salesforce Marketing Cloud Account Engagement connector
](salesforce-marketing-cloud-account-engagement-connector-limitations.md)

# AWS Glue support for Salesforce Marketing Cloud Account Engagement


AWS Glue supports Salesforce Marketing Cloud Account Engagement as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Salesforce Marketing Cloud Account Engagement in either Async or Sync mode.

**Supported as a target?**  
No.

**Supported Salesforce Marketing Cloud Account Engagement API versions**  
The following Salesforce Marketing Cloud Account Engagement API versions are supported:
+ v5

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Salesforce Marketing Cloud Account Engagement


Before you can use AWS Glue to transfer data from Salesforce Marketing Cloud Account Engagement, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a Salesforce marketing account.
+ You have a licensed Account Engagement plan for the Salesforce account. 
+ You have synced the Salesforce user with the Account Engagement user.
+ You have created a new connected app under App Manager to obtain OAuth Credentials.

If you meet these requirements, you’re ready to connect AWS Glue to your Salesforce Marketing Cloud Account Engagement account.

# Configuring Salesforce Marketing Cloud Account Engagement connections


The grant type determines how AWS Glue communicates with Salesforce Marketing Cloud Account Engagement to request access to your data. Your choice affects the requirements that you must meet before you create the connection. Salesforce Marketing Cloud Account Engagement supports only the AUTHORIZATION_CODE grant type for OAuth 2.0.
+ This grant type is considered "three-legged" OAuth as it relies on redirecting users to a third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue console.
+ Users may still opt to create their own connected app in Salesforce Marketing Cloud Account Engagement and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to Salesforce Marketing Cloud Account Engagement to log in and authorize AWS Glue to access their resources.
+ This grant type results in a refresh token and an access token. The access token is short-lived and may be refreshed automatically, without user interaction, using the refresh token.
+ For public Salesforce Marketing Cloud Account Engagement documentation on creating a connected app for Authorization Code OAuth flow, see [Authentication](https://developer.salesforce.com/docs/marketing/pardot/guide/version5overview.html#authentication).

To configure a Salesforce Marketing Cloud Account Engagement connection:

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select Salesforce Marketing Cloud Account Engagement.

   1. Provide the `INSTANCE_URL` of the Salesforce Marketing Cloud Account Engagement instance you want to connect to.

   1. Provide the `PARDOT_BUSINESS_UNIT_ID` of the Salesforce Marketing Cloud Account Engagement instance you want to connect to.

   1. Select the appropriate **Authorization Code URL** from the dropdown.

   1. Select the appropriate **Token URL** from the dropdown.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version":"2012-10-17",		 	 	 
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Provide the User Managed Client Application Client ID (the client ID from the connected app).

   1. Select the secret (`secretName`) that you want AWS Glue to use to store the tokens for this connection. The selected secret must have a key `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` whose value is the Client Secret from the connected app.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

1. In your AWS Glue job configuration, provide `connectionName` as an **Additional network connection**.
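The secret referenced in the steps above must carry the connected app's client secret under a specific key. The following is a minimal sketch of preparing such a secret; the secret name and client secret value are hypothetical placeholders, and the actual Secrets Manager call is shown commented out since it requires AWS credentials:

```
import json

# Hypothetical secret name and client secret, for illustration only.
secret_name = "glue-pardot-connection-secret"
client_secret = "exampleClientSecretFromConnectedApp"

# The secret must contain this exact key; AWS Glue reads the connected
# app's client secret from it when the connection authenticates.
secret_string = json.dumps(
    {"USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET": client_secret}
)

# With AWS credentials configured, the secret could be stored like this:
# import boto3
# boto3.client("secretsmanager").create_secret(
#     Name=secret_name, SecretString=secret_string
# )
```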

# Reading from Salesforce Marketing Cloud Account Engagement entities


**Prerequisite**

A Salesforce Marketing Cloud Account Engagement object you would like to read from. You will need the object name.

**Supported entities for Sync source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select \* | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Campaign | Yes | Yes | Yes | Yes | Yes | 
| Dynamic Content | Yes | Yes | Yes | Yes | Yes | 
| Email | Yes | Yes | Yes | Yes | Yes | 
| Email Template | Yes | Yes | Yes | Yes | Yes | 
| Engagement Studio Program | Yes | Yes | Yes | Yes | Yes | 
| Folder Contents | Yes | Yes | Yes | Yes | Yes | 
| Landing Page | Yes | Yes | Yes | Yes | Yes | 
| Lifecycle History | Yes | Yes | Yes | Yes | Yes | 
| Lifecycle Stage | Yes | Yes | Yes | Yes | Yes | 
| List | Yes | Yes | Yes | Yes | Yes | 
| List Email | Yes | Yes | Yes | Yes | Yes | 
| List Membership | Yes | Yes | Yes | Yes | Yes | 
| Opportunity | Yes | Yes | Yes | Yes | Yes | 
| Prospect | Yes | Yes | Yes | Yes | Yes | 
| Prospect Account | Yes | Yes | Yes | Yes | Yes | 
| User | Yes | Yes | Yes | Yes | Yes | 

**Example**:

```
salesforcepardot_read = glueContext.create_dynamic_frame.from_options(
    connection_type="SalesforcePardot",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v5"
    }
   )
```

**Supported entities for Async source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select \* | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Campaign | Yes | No | No | Yes | No | 
| Dynamic Content | Yes | No | No | Yes | No | 
| Email Template | Yes | No | No | Yes | No | 
| Landing Page | Yes | No | No | Yes | No | 
| Lifecycle History | Yes | No | No | Yes | No | 
| Lifecycle Stage | Yes | No | No | Yes | No | 
| List | Yes | No | No | Yes | No | 
| List Email | Yes | No | No | Yes | No | 
| List Membership | Yes | No | No | Yes | No | 
| Opportunity | Yes | No | No | Yes | No | 
| Prospect | Yes | No | No | Yes | No | 
| Prospect Account | Yes | No | No | Yes | No | 
| User | Yes | No | No | Yes | No | 

**Example**:

```
salesforcepardot_read = glueContext.create_dynamic_frame.from_options(
    connection_type="SalesforcePardot",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v5",
        "TRANSFER_MODE": "ASYNC"
    }
   )
```

**Salesforce Marketing Cloud Account Engagement entity and field details**:

To view the field details for the following entities, navigate to [Salesforce Marketing Cloud Account Engagement API](https://developer.salesforce.com/docs/marketing/pardot), choose **Guides**, scroll down to **Open Source API Wrappers**, expand **Version 5 Docs** from the menu and choose an entity.

Entities list:
+ Campaign
+ Dynamic Content
+ Email
+ Email Template
+ Engagement Studio Program
+ Folder Content
+ Landing Page
+ Lifecycle History
+ Lifecycle Stage
+ List
+ List Email
+ List Membership
+ Opportunity
+ Prospect
+ Prospect Account
+ User

In addition to the fields mentioned above, the Async mode supports specific filterable fields for each entity, as shown in the table below.


| Entity | Additional filterable fields supported in Async | 
| --- | --- | 
| Campaign | createdAfter, createdBefore, deleted, updatedAfter, updatedBefore | 
| Dynamic Content | createdAfter, createdBefore, deleted, updatedAfter, updatedBefore | 
| Email Template | createdAfter, createdBefore, deleted, updatedAfter, updatedBefore | 
| Engagement Studio Program | - | 
| Landing Page | createdAfter, createdBefore, deleted, updatedAfter, updatedBefore | 
| Lifecycle History | createdAfter, createdBefore | 
| Lifecycle Stage | createdAfter, createdBefore, deleted, updatedAfter, updatedBefore | 
| List | createdAfter, createdBefore, deleted, updatedAfter, updatedBefore | 
| List Email | createdAfter, createdBefore, deleted, updatedAfter, updatedBefore | 
| List Membership | createdAfter, createdBefore, deleted, updatedAfter, updatedBefore | 
| Opportunity | createdAfter, createdBefore, deleted, updatedAfter, updatedBefore | 
| Prospect | createdAfter, createdBefore, deleted, updatedAfter, updatedBefore | 
| Prospect Account | createdAfter, createdBefore, deleted | 
| User | createdAfter, createdBefore, deleted, updatedAfter, updatedBefore | 

For more information about the additional fields, refer to [Salesforce Export API](https://developer.salesforce.com/docs/marketing/pardot/guide/export-v5.html#procedures).

Note the following considerations for the connector:
+ The value of the `deleted` field in the entities can be `false` (default), `true`, or `all`.

## Partitioning queries


**Filter-based partitioning**:

You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For a Datetime field, the Spark timestamp format used in Spark SQL queries is accepted.

  Example of a valid value:

  ```
  "2022-01-01T01:01:01.000Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.
+ `PARTITION_BY`: the type of partitioning to be performed. Pass "FIELD" for field-based partitioning.

Example:

```
salesforcepardot_read = glueContext.create_dynamic_frame.from_options(
    connection_type="salesforcepardot",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v5",
        "PARTITION_FIELD": "createdAt",
        "LOWER_BOUND": "2022-01-01T01:01:01.000Z",
        "UPPER_BOUND": "2024-01-01T01:01:01.000Z",
        "NUM_PARTITIONS": "10",
        "PARTITION_BY": "FIELD"
    }
   )
```
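To illustrate how the bounds divide the work, the following sketch splits `[LOWER_BOUND, UPPER_BOUND)` into `NUM_PARTITIONS` equal time ranges. This mirrors the idea of filter-based partitioning, not the connector's actual implementation:

```
from datetime import datetime

# Timestamps use the Spark SQL format shown above; "Z" means UTC.
FMT = "%Y-%m-%dT%H:%M:%S.%f%z"

def parse(ts):
    return datetime.strptime(ts.replace("Z", "+0000"), FMT)

def split_partitions(lower, upper, num_partitions):
    lo, hi = parse(lower), parse(upper)
    step = (hi - lo) / num_partitions
    bounds = [lo + step * i for i in range(num_partitions + 1)]
    # Each sub-query covers [bounds[i], bounds[i+1]) — inclusive lower,
    # exclusive upper, matching the LOWER_BOUND/UPPER_BOUND semantics.
    return list(zip(bounds[:-1], bounds[1:]))

parts = split_partitions(
    "2022-01-01T01:01:01.000Z", "2024-01-01T01:01:01.000Z", 10)
```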

# Salesforce Marketing Cloud Account Engagement connection options


The following are connection options for Salesforce Marketing Cloud Account Engagement:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in Salesforce Marketing Cloud Account Engagement.
+ `PARDOT_BUSINESS_UNIT_ID` - (Required) Used for creating a connection. The business unit ID of the Salesforce Marketing Cloud Account Engagement instance you want to connect to.
+ `API_VERSION`(String) - (Required) Used for Read. Salesforce Marketing Cloud Account Engagement Rest API version you want to use.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT \*). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) -
  + In Sync mode - Default: empty. Used for Read. It should be in the Spark SQL format.
  + In Async mode - Default: the current `DateTime` value (per the user's time zone) minus 1 year. Used for Read.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition query.
+ `LOWER_BOUND`(String)- Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read.
+ `INSTANCE_URL`(String) - (Required) Used for Read. A valid Salesforce Marketing Cloud Account Engagement instance URL.
+ `PARTITION_BY`(String) - (Required) Used for Read. The type of partitioning to be performed. Pass "FIELD" for field-based partitioning.
+ `TRANSFER_MODE`(String) - (Optional) Set to `ASYNC` to run the job in Async mode. If this option is not provided, the job runs in Sync mode.

# Limitations and notes for Salesforce Marketing Cloud Account Engagement connector


The following notes and limitations apply:
+ When both a limit and partitioning are applied, the limit takes precedence over the partitioning.
+ As per the API docs, `SalesforceMarketingCloudEngagement` enforces a rate limit on daily and concurrent requests. For more information, see [Rate Limits](https://developer.salesforce.com/docs/marketing/pardot/guide/overview.html?q=limitation#rate-limits).
+ The Export API is subject to the daily Account Engagement API call limit and the concurrent Account Engagement API call limit for your account.
+ Similar to a queue, Export/Async API calls are executed sequentially for each account. Older exports are processed before newer exports.
+ Partition is not supported in Async mode.
+ The number of selected fields specified in the Export/Async API calls can’t exceed 150.
+ The **Prospect** entity supports over 150 fields, but only 150 fields can be selected at a time. If `Select All` is chosen, some fields will be excluded. To retrieve data for these excluded fields, you have to include them in the `Selected Fields` option.

  The following is the list of excluded fields in `SELECT_ALL` - `updatedBy.firstName`, `updatedBy.lastName`, `updatedBy.jobTitle`, `updatedBy.roleName`, `updatedBy.salesforceId`, `updatedBy.createdAt`, `updatedBy.updatedAt`, `updatedBy.isDeleted`, `updatedBy.createdById`, `updatedBy.updatedById`, `updatedBy.tagReplacementLanguage`
+ Collection fields can’t be exported for Async. For example, on List Email the `senderOptions` and `replyToOptions` fields are not supported.
+ For all entities, a filter is mandatory. If no filter is provided, the default filter predicate is set to the `Created After` field with a value of the current date-time (adjusted to your time zone) minus one year.
+ As per Salesforce Marketing Cloud Account Engagement limitations, in Async, the maximum range to fetch data is 1 year. If a query is provided for more than 1 year, the job will throw an error. 
+ Currently, there is a bug in Salesforce Pardot: when a job selects only a single field and that field contains no data, the correct field value is not returned; instead, the field name is returned multiple times. The Salesforce Pardot team is aware of the issue and is actively working on a resolution.
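Because Async mode defaults the filter to the current time minus one year and rejects windows longer than a year, a client-side check can fail fast before a job is submitted. This is an illustrative sketch only (it approximates "1 year" as 365 days; the connector's own enforcement may differ):

```
from datetime import datetime, timedelta, timezone

# Approximating the 1-year Async limit as 365 days for this sketch.
MAX_ASYNC_RANGE = timedelta(days=365)

def default_created_after(now=None):
    """Default Async filter value: the current time minus one year."""
    now = now or datetime.now(timezone.utc)
    return now - MAX_ASYNC_RANGE

def validate_async_range(start, end):
    """Raise before submitting a job whose window exceeds the limit."""
    if end - start > MAX_ASYNC_RANGE:
        raise ValueError("Async transfers are limited to a 1-year window")
```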

# Connecting to SAP HANA in AWS Glue Studio

 AWS Glue provides built-in support for SAP HANA. AWS Glue Studio provides a visual interface to connect to SAP HANA, author data integration jobs, and run them on the AWS Glue Studio serverless Spark runtime. 

 AWS Glue Studio creates a unified connection for SAP HANA. For more information, see [Considerations](using-connectors-unified-connections.md#using-connectors-unified-connections-considerations). 

**Topics**
+ [

# Creating a SAP HANA connection
](creating-saphana-connection.md)
+ [

# Creating a SAP HANA source node
](creating-saphana-source-node.md)
+ [

# Creating a SAP HANA target node
](creating-saphana-target-node.md)
+ [

## Advanced options
](#creating-saphana-connection-advanced-options)

# Creating a SAP HANA connection


To connect to SAP HANA from AWS Glue, you will need to create and store your SAP HANA credentials in an AWS Secrets Manager secret, then associate that secret with a SAP HANA AWS Glue connection. You will also need to configure network connectivity between your SAP HANA service and AWS Glue.

**Prerequisites**:
+ If your SAP HANA service is in an Amazon VPC, configure Amazon VPC to allow your AWS Glue job to communicate with the SAP HANA service without traffic traversing the public internet.

  In Amazon VPC, identify or create a **VPC**, **Subnet** and **Security group** that AWS Glue will use while executing the job. Additionally, you need to ensure Amazon VPC is configured to permit network traffic between your SAP HANA endpoint and this location. Your job will need to establish a TCP connection with your SAP HANA JDBC port. For more information about SAP HANA ports, see the [SAP HANA documentation](https://help.sap.com/docs/HANA_SMART_DATA_INTEGRATION/7952ef28a6914997abc01745fef1b607/88e2e8bded9e4041ad3ad87dc46c7b55.html?locale=en-US). Based on your network layout, this may require changes to security group rules, Network ACLs, NAT Gateways and Peering connections.

**To configure a connection to SAP HANA:**

1. In AWS Secrets Manager, create a secret using your SAP HANA credentials. To create a secret in Secrets Manager, follow the tutorial available in [ Create an AWS Secrets Manager secret ](https://docs.aws.amazon.com//secretsmanager/latest/userguide/create_secret.html) in the AWS Secrets Manager documentation. After creating the secret, keep the Secret name, *secretName* for the next step. 
   + When selecting **Key/value pairs**, create a pair for the key `username/USERNAME` with the value *saphanaUsername*.
   + When selecting **Key/value pairs**, create a pair for the key `password/PASSWORD` with the value *saphanaPassword*.

1. In the AWS Glue console, create a connection by following the steps in [Adding an AWS Glue connection](console-connections.md). After creating the connection, keep the connection name, *connectionName*, for future use in AWS Glue. 
   + When selecting a **Connection type**, select SAP HANA.
   + When providing **SAP HANA URL**, provide the URL for your instance.

     SAP HANA JDBC URLs are in the form `jdbc:sap://saphanaHostname:saphanaPort/?databaseName=saphanaDBname,ParameterName=ParameterValue`

     AWS Glue requires the following JDBC URL parameters: 
     + `databaseName` – A default database in SAP HANA to connect to.
   + When selecting an **AWS Secret**, provide *secretName*.

After creating an AWS Glue SAP HANA connection, you will need to perform the following steps before running your AWS Glue job:
+ Grant the IAM role associated with your AWS Glue job permission to read *secretName*.
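The JDBC URL format shown above can be assembled programmatically when scripting connection setup. A minimal sketch; the hostname, port, and database name below are hypothetical placeholders:

```
# Illustrative helper: assemble a SAP HANA JDBC URL in the form the
# connection expects. All argument values here are placeholders.
def saphana_jdbc_url(host, port, database):
    return f"jdbc:sap://{host}:{port}/?databaseName={database}"

url = saphana_jdbc_url("saphana.example.com", 39015, "HXE")
```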

# Creating a SAP HANA source node


## Prerequisites

+ An AWS Glue SAP HANA connection, configured with an AWS Secrets Manager secret, as described in the previous section, [Creating a SAP HANA connection](creating-saphana-connection.md).
+ Appropriate permissions on your job to read the secret used by the connection.
+ A SAP HANA table you would like to read from, *tableName*, or query *targetQuery*.

  A table can be specified with a SAP HANA table name and schema name, in the form `schemaName.tableName`. The schema name and "." separator are not required if the table is in the default schema, "public". Call this *tableIdentifier*. Note that the database is provided as a JDBC URL parameter in `connectionName`.

## Adding a SAP HANA data source


**To add a Data source – SAP HANA node:**

1.  Choose the connection for your SAP HANA data source. Since you have created it, it should be available in the dropdown. If you need to create a connection, choose **Create SAP HANA connection**. For more information see the previous section, [Creating a SAP HANA connection](creating-saphana-connection.md). 

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1.  Choose a **SAP HANA Source** option: 
   +  **Choose a single table** – access all data from a single table. 
   +  **Enter custom query** – access a dataset from multiple tables based on your custom query. 

1.  If you chose a single table, enter *tableName*. 

    If you chose **Enter custom query**, enter a SQL SELECT query. 

1.  In **Custom SAP HANA properties**, enter parameters and values as needed. 
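The same read can also be expressed in a Glue for Spark script. This is a hedged sketch: the option values are placeholders, `"dbtable"` is assumed to identify *schemaName.tableName* per the prerequisites above, and the dynamic-frame call is commented because it only runs inside a Glue job; see the SAP HANA connection options reference for the authoritative option list:

```
# Placeholder options for reading a SAP HANA table through the
# connection created earlier in this section.
options = {
    "connectionName": "connectionName",
    "dbtable": "schemaName.tableName",
}

# Inside a Glue job, the dynamic frame would be created like this:
# dyf = glueContext.create_dynamic_frame.from_options(
#     connection_type="saphana", connection_options=options
# )
```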

# Creating a SAP HANA target node


## Prerequisites

+ An AWS Glue SAP HANA connection, configured with an AWS Secrets Manager secret, as described in the previous section, [Creating a SAP HANA connection](creating-saphana-connection.md).
+ Appropriate permissions on your job to read the secret used by the connection.
+ A SAP HANA table you would like to write to, *tableName*.

  A table can be specified with a SAP HANA table name and schema name, in the form `schemaName.tableName`. The schema name and "." separator are not required if the table is in the default schema, "public". Call this *tableIdentifier*. Note that the database is provided as a JDBC URL parameter in `connectionName`.

## Adding a SAP HANA data target


**To add a Data target – SAP HANA node:**

1.  Choose the connection for your SAP HANA data target. Since you have created it, it should be available in the dropdown. If you need to create a connection, choose **Create SAP HANA connection**. For more information see the previous section, [Creating a SAP HANA connection](creating-saphana-connection.md). 

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1. Configure **Table name** by providing *tableName*.

1.  In **Custom SAP HANA properties**, enter parameters and values as needed. 

## Advanced options


You can provide advanced options when creating a SAP HANA node. These options are the same as those available when programming AWS Glue for Spark scripts.

See [SAP HANA connections](aws-glue-programming-etl-connect-saphana-home.md). 

# Connecting to SAP OData

SAP OData is a standard web protocol for querying and updating data in SAP systems built on ABAP (Advanced Business Application Programming). It applies and builds on web technologies such as HTTP to provide access to information from a variety of external applications, platforms, and devices. With this connector, you can access everything you need to seamlessly integrate with your SAP system, application, or data.

**Topics**
+ [

# AWS Glue support for SAP OData
](sap-odata-support.md)
+ [

# Create connections
](sap-odata-creating-connections.md)
+ [

# Creating SAP OData job
](sap-odata-creating-job.md)
+ [

# Writing to SAP OData
](sap-odata-writing.md)
+ [

# Using the SAP OData state management script
](sap-odata-state-management-script.md)
+ [

# Partitioning for Non ODP entities
](sap-odata-non-odp-entities-partitioning.md)
+ [

# SAP OData connection options
](sap-odata-connection-options.md)
+ [

# SAP OData entity and field details
](sap-odata-entity-field-details.md)

# AWS Glue support for SAP OData


AWS Glue supports SAP OData as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from SAP OData.

**Supported as a target?**  
Yes. You can use AWS Glue ETL jobs to write records into SAP OData.

**Supported SAP OData API versions**  
The following SAP OData API versions are supported:
+ 2.0

**Supported sources**  
The following sources are supported:
+ ODP (Operational Data Provisioning) Sources:
  + BW Extractors (DataSources)
  + CDS Views
  + SLT
+ Non-ODP Sources, for example:
  + CDS View Services
  + RFC-based Services
  + Custom ABAP Services

**Supported SAP Components**  
The following are minimum requirements:
+ You must enable catalog service for service discovery.
  + Configure operational data provisioning (ODP) data sources for extraction in the SAP Gateway of your SAP system.
  + **OData V2.0**: Enable the OData V2.0 catalog service(s) in your SAP Gateway via transaction `/IWFND/MAINT_SERVICE`.
  + Enable OData V2.0 services in your SAP Gateway via transaction `/IWFND/MAINT_SERVICE`.
  + Your SAP OData service must support client side pagination/query options such as `$top` and `$skip`. It must also support system query option `$count`.
  + You must provide the required authorization for the user in SAP to discover the services and extract data using SAP OData services. Refer to the security documentation provided by SAP.
+ If you want to use OAuth 2.0 as an authorization mechanism, you must enable OAuth 2.0 for the OData service and register the OAuth client per SAP documentation.
+ To generate an OData service based on ODP data sources, SAP Gateway Foundation must be installed locally in your ERP/BW stack or in a hub configuration.
  + For your ERP/BW applications, the SAP NetWeaver AS ABAP stack must be at 7.50 SP02 or above.
  + For the hub system (SAP Gateway), the SAP NetWeaver AS ABAP of the hub system must be 7.50 SP01 or above for remote hub setup.
+ For non-ODP sources, your SAP NetWeaver stack version must be 7.40 SP02 or above.

**Supported Authentication Methods**  
The following authentication methods are supported:
+ Basic Authentication
+ OAuth 2.0

# Prerequisites


Prior to initiating an AWS Glue job for data extraction from SAP OData using the SAP OData connection, complete the following prerequisites:
+ The relevant SAP OData Service must be activated in the SAP system, ensuring the data source is available for consumption. If the OData service is not activated, the Glue job will not be able to access or extract data from SAP.
+ Appropriate authentication mechanisms such as basic (custom) authentication or OAuth 2.0 must be configured in SAP to ensure that the AWS Glue job can successfully establish a connection with the SAP OData service.
+ Configure IAM policies to grant the AWS Glue job appropriate permissions for accessing SAP, Secrets Manager, and other AWS resources involved in the process.
+ If the SAP system is hosted within a private network, VPC connectivity must be configured to ensure that the AWS Glue job can securely communicate with SAP without exposing sensitive data over public internet.

AWS Secrets Manager can be used to securely store sensitive information such as SAP credentials, which the AWS Glue job can dynamically retrieve at runtime. This approach eliminates the need to hard-code credentials, enhancing security and flexibility.

The following prerequisites provide step-by-step guidance on how to set up each component for a smooth integration between AWS Glue and SAP OData.

**Topics**
+ [

# SAP OData activation
](sap-odata-activation.md)
+ [

# IAM policies
](sap-odata-configuring-iam-permissions.md)
+ [

# Connectivity / VPC Connection
](sap-odata-connectivity-vpc-connection.md)
+ [

# SAP Authentication
](sap-odata-authentication.md)
+ [

# AWS Secrets Manager to store your Auth secret
](sap-odata-aws-secret-manager-auth-secret.md)

# SAP OData activation


Complete the following steps for SAP OData connection:

## ODP Sources


Before you can transfer data from an ODP provider, you must meet the following requirements:
+ You have an SAP NetWeaver AS ABAP instance.
+ Your SAP NetWeaver instance contains an ODP provider that you want to transfer data from. ODP providers include:
  + SAP DataSources (Transaction code RSO2)
  + SAP Core Data Services ABAP CDS Views
  + SAP BW or SAP BW/4HANA systems (InfoObject, DataStore Object)
  + Real-time replication of Tables and DB-Views from SAP Source System via SAP Landscape Replication Server (SAP SLT)
  + SAP HANA Information Views in SAP ABAP based Sources
+ Your SAP NetWeaver instance has the SAP Gateway Foundation component.
+ You have created an OData service that extracts data from your ODP provider. To create the OData service, you use the SAP Gateway Service Builder. To access your ODP data, Amazon AppFlow calls this service by using the OData API. For more information, see [Generating a Service for Extracting ODP Data via OData](https://help.sap.com/docs/SAP_BPC_VERSION_BW4HANA/dd104a87ab9249968e6279e61378ff66/69b481859ef34bab9cc7d449e6fff7b6.html?version=11.0) in the SAP BW/4HANA documentation.
+ To generate an OData service based on ODP data sources, SAP Gateway Foundation must be installed locally in your ERP/BW stack or in a hub configuration.
  + For your ERP/BW applications, the SAP NetWeaver AS ABAP stack must be at 7.50 SP02 or above.
  + For the hub system (SAP Gateway), the SAP NetWeaver AS ABAP of the hub system must be 7.50 SP01 or above for remote hub setup.

## Non-ODP Sources

+ Your SAP NetWeaver stack version must be 7.40 SP02 or above.
+ You must enable catalog service for service discovery.
  + **OData V2.0**: The OData V2.0 catalog service(s) can be enabled in your SAP Gateway via transaction `/IWFND/MAINT_SERVICE`
+ Your SAP OData service must support client side pagination/query options such as `$top` and `$skip`. It must also support system query option `$count`.
+ For OAuth 2.0, you must enable OAuth 2.0 for the OData service and register the OAuth client per SAP documentation and set the authorized redirect URL as follows:
  + `https://<region>.console.aws.amazon.com/gluestudio/oauth`, replacing `<region>` with the region where AWS Glue is running, example: us-east-1. 
  + You must enable secure setup for connecting over HTTPS.
+ You must provide the required authorization for the user in SAP to discover the services and extract data using SAP OData services. Refer to the security documentation provided by SAP.

# IAM policies

## Policies containing the API operations for creating and using connections


The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "secretsmanager:DescribeSecret",
        "secretsmanager:GetSecretValue",
        "secretsmanager:PutSecretValue"
      ],
      "Resource": "*"
    }
  ]
}
```

------

The role must grant access to all the resources used by the job, for example Amazon S3. If you don't want to use the above method, you can instead use the following managed IAM policies.
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.
+ [SecretsManagerReadWrite](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/SecretsManagerReadWrite) – Provides read/write access to AWS Secrets Manager via the AWS Management Console. Note: this excludes IAM actions, so combine with `IAMFullAccess` if rotation configuration is required.

**IAM Policies/Permissions needed to configure VPC**

The following IAM permissions are required while using VPC connection for creating AWS Glue Connection. For more details, refer to [create an IAM policy for AWS Glue](https://docs.aws.amazon.com/glue/latest/dg/create-service-policy.html).

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateNetworkInterface",
        "ec2:DeleteNetworkInterface",
        "ec2:DescribeNetworkInterfaces"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
```

------

# Connectivity / VPC Connection


Steps for VPC Connection:

1. Use existing VPC connection or create a new connection by following the [Amazon VPC documentation](https://docs.aws.amazon.com/vpc/latest/userguide/create-vpc.html).

1. Make sure you have a NAT gateway that routes traffic to the internet.

1. Create an Amazon S3 gateway VPC endpoint for the connection.

1. Enable DNS resolution and DNS hostnames to use AWS-provided DNS services.

1. Go to the created VPC and add the necessary endpoints for services such as STS, AWS Glue, and Secrets Manager.

   1. Choose Create Endpoint.

   1. For Service Category, choose AWS Services.

   1. For Service Name, choose the service that you are connecting to.

   1. Choose VPC and Enable DNS Name.

   1. VPC endpoints required for the VPC connection:

      1. [STS](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_sts_vpc_endpoint_create.html)

      1. [AWS Glue](https://docs.aws.amazon.com/glue/latest/dg/vpc-interface-endpoints.html)

      1. [Secrets Manager](https://docs.aws.amazon.com/secretsmanager/latest/userguide/vpc-endpoint-overview.html)
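The interface endpoints listed above can also be created programmatically. The following is a sketch under stated assumptions: the region, VPC ID, and endpoint selection are placeholders, and the `boto3` call is commented out because it requires AWS credentials and a real VPC:

```
# Interface endpoint service names for the services listed above,
# shown for us-east-1 as an example region.
region = "us-east-1"
service_names = [f"com.amazonaws.{region}.{svc}"
                 for svc in ("sts", "glue", "secretsmanager")]

# With AWS credentials configured, each endpoint could be created via:
# import boto3
# ec2 = boto3.client("ec2", region_name=region)
# for name in service_names:
#     ec2.create_vpc_endpoint(
#         VpcEndpointType="Interface",
#         VpcId="vpc-0123456789abcdef0",  # placeholder VPC ID
#         ServiceName=name,
#         PrivateDnsEnabled=True,
#     )
```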

## Security Group Configuration


The security group must allow traffic to its listening port from the AWS Glue VPC so that AWS Glue can connect to it. It is a good practice to restrict the range of source IP addresses as much as possible. 

AWS Glue requires a security group that allows all inbound traffic from itself. You can create a self-referencing rule that allows all traffic originating from the security group, or modify an existing security group to specify itself as the source.

Open communication from the HTTPS ports of the URL endpoint (either the NLB or the SAP instance).

## Connectivity options

+ HTTPS connection with internal and external NLB, SSL certificate from certificate authority (CA), not self-signed SSL certificate
+ HTTPS connection with SAP instance SSL certificate from certificate authority (CA), not self-signed SSL certificate

# SAP Authentication


The SAP connector supports both CUSTOM (SAP Basic authentication) and OAUTH authentication methods.

## Custom Authentication


AWS Glue supports Custom (Basic Authentication) as a method for establishing connections to your SAP systems, allowing the use of a username and password for secure access. This authentication type works well for automation scenarios because the username and password are supplied up front, with the permissions of a particular user in the SAP OData instance. AWS Glue uses the username and password to authenticate against the SAP OData APIs. In AWS Glue, basic authorization is implemented as custom authorization.

For public SAP OData documentation for Basic Auth flow, see [HTTP Basic Authentication](https://help.sap.com/docs/SAP_SUCCESSFACTORS_PLATFORM/d599f15995d348a1b45ba5603e2aba9b/5c8bca0af1654b05a83193b2922dcee2.html).

## OAuth 2.0 Authentication


AWS Glue also supports OAuth 2.0 as a secure authentication mechanism for establishing connections to your SAP systems. This enables seamless integration while ensuring compliance with modern authentication standards and enhancing the security of data access.

## AUTHORIZATION\_CODE Grant Type


The grant type determines how AWS Glue communicates with SAP OData to request access to your data. SAP OData supports only the `AUTHORIZATION_CODE` grant type. This grant type is considered "three-legged" OAuth as it relies on redirecting users to the third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue console. 

Users may still opt to create their own connected app in SAP OData and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to SAP OData to login and authorize AWS Glue to access their resources.

This grant type results in a refresh token and an access token. The access token is short-lived and can be refreshed automatically, without user interaction, using the refresh token.

For public SAP OData documentation on creating a connected app for Authorization Code OAuth flow, see [Authentication Using OAuth 2.0](https://help.sap.com/docs/ABAP_PLATFORM_NEW/e815bb97839a4d83be6c4fca48ee5777/2e5104fd87ff452b9acb247bd02b9f9e.html).

# AWS Secrets Manager to store your Auth secret

You will need to store the SAP OData connection secrets in AWS Secrets Manager, configure the necessary permissions for retrieval as specified in the [IAM policies](sap-odata-configuring-iam-permissions.md) section, and use it while creating a connection.

Use the AWS Management Console for AWS Secrets Manager to create a secret for your SAP source. For more information, see [Create an AWS Secrets Manager secret](https://docs.aws.amazon.com/secretsmanager/latest/userguide/create_secret.html). Details in AWS Secrets Manager should include the elements in the following code. 

## Custom Authentication Secret


You will need to enter your SAP system username in place of *<your SAP username>*, its password in place of *<your SAP username password>*, and `True` or `False` for `basicAuthDisableSSO`. In this context, setting `basicAuthDisableSSO` to `True` disables Single Sign-On (SSO) for Basic authentication requests, requiring explicit user credentials for each request. Conversely, setting it to `False` allows the use of existing SSO sessions if available.

```
{
   "basicAuthUsername": "<your SAP username>",
   "basicAuthPassword": "<your SAP username password>",
   "basicAuthDisableSSO": "<True/False>",
   "customAuthenticationType": "CustomBasicAuth"
}
```

## OAuth 2.0 Secret


If you are using OAuth 2.0 as your authentication mechanism, the secret in AWS Secrets Manager should have the **User Managed Client Application ClientId** in the following format. You will need to enter your SAP client secret in place of *<your client secret>*.

```
{
   "USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET": "<your client secret>"
}
```

# Create connections


To configure an SAP OData connection:

1. Sign in to the AWS Management Console and open the [AWS Glue console](https://console.aws.amazon.com/glue). In AWS Glue Studio, create a connection by following these steps:

   1. Choose **Data connections** in the left navigation pane.

   1. Choose **Create connection**.

   1. Select **SAP OData** in **Choose data source**

   1. Provide the **Application host URL** of the SAP OData instance you want to connect to. This application host URL must be accessible over the public internet for a non-VPC connection.

   1. Provide the **Application service path** of the SAP OData instance you want to connect to. This is the same as the catalog service path, for example: `/sap/opu/odata/iwfnd/catalogservice;v=2`. AWS Glue doesn’t accept a specific object path here.

   1. Provide the **Client number** of the SAP OData instance you want to connect to. Acceptable values are [001-999]. Example: 010

   1. Provide the **Port number** of the SAP OData instance you want to connect to. Example: 443

   1. Provide the **Logon language** of the SAP OData instance you want to connect to. Example: EN

   1. Select the AWS IAM role which AWS Glue can assume and has permissions as outlined in the [IAM policies](sap-odata-configuring-iam-permissions.md) section.

   1. Select the **Authentication Type** that you want to use for this connection in AWS Glue from the dropdown list: OAUTH2 or CUSTOM.

      1. CUSTOM - Select the secret you created as specified in the [AWS Secrets Manager to store your Auth secret](sap-odata-aws-secret-manager-auth-secret.md) section.

      1. OAUTH2 - Enter the following inputs (OAuth 2.0 only):

         1. Under **User Managed Client Application ClientId**, enter your client id.

         1. Select the secret containing `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` (your client secret) that you created in the [AWS Secrets Manager to store your Auth secret](sap-odata-aws-secret-manager-auth-secret.md) section.

         1. Under **Authorization Code URL**, enter your authorization code URL.

         1. Under **Authorization Tokens URL**, enter your authorization token URL.

         1. Under **OAuth Scopes**, enter your OAuth scopes separated by space. Example: `/IWFND/SG_MED_CATALOG_0002 ZAPI_SALES_ORDER_SRV_0001`

   1. Select the network options if you want to use your network. For more details, see [Connectivity / VPC Connection](sap-odata-connectivity-vpc-connection.md).

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`. For more details, see the [IAM policies](sap-odata-configuring-iam-permissions.md) section.

1. Choose **Test connection** to test your connection. If the connection test passes, choose **Next**, enter your connection name, and save the connection. Test connection functionality is not available if you have chosen network options (VPC).

# Creating SAP OData job


To create a job, refer to [Building visual ETL jobs with AWS Glue Studio](https://docs.aws.amazon.com/glue/latest/dg/author-job-glue.html).

# Operational Data Provisioning (ODP) Sources


Operational Data Provisioning (ODP) provides a technical infrastructure that you can use to support data extraction and replication for various target applications and supports delta mechanisms in these scenarios. In case of a delta procedure, the data from a source (ODP Provider) is automatically written to a delta queue (Operational Delta Queue – ODQ) using an update process or passed to the delta queue using an extractor interface. An ODP Provider can be a DataSource (extractors), ABAP Core Data Services Views (ABAP CDS Views), SAP BW or SAP BW/4HANA, SAP Landscape Transformation Replication Server (SLT), and SAP HANA Information Views (calculation views). The target applications (referred to as ODQ 'subscribers' or more generally “ODP Consumers”) retrieve the data from the delta queue and continue processing the data.

## Full Load


In the context of SAP OData and ODP entities, a **Full Load** refers to the process of extracting all available data from an ODP entity in a single operation. This operation retrieves the complete dataset from the source system, ensuring that the target system has a comprehensive and up-to-date copy of the entity's data. Full loads are typically used for sources that do not support incremental loads or when a refresh of the target system is required.

**Example**

You can explicitly set the `ENABLE_CDC` flag to `false` when creating the DynamicFrame. Note that `ENABLE_CDC` is `false` by default, so if you don’t want to initialize the delta queue, you can omit the flag or set it to `false` explicitly. Not setting this flag to `true` results in a full load extraction.

```
sapodata_df = glueContext.create_dynamic_frame.from_options(
    connection_type="SAPOData",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "ENABLE_CDC": "false"
    }, transformation_ctx=key)
```

## Incremental Load


An **incremental load** in the context of ODP (Operational Data Provisioning) entities involves extracting only the new or changed data (deltas) from the source system since the last data extraction, avoiding reprocessing of already processed records. This approach significantly improves efficiency, reduces data transfer volumes, enhances performance, ensures efficient synchronization between systems, and minimizes processing time, especially for large datasets that change frequently.

# Delta Token based Incremental Transfers


To enable Incremental Transfer using Change Data Capture (CDC) for ODP-enabled entities that support it, follow these steps:

1. Create the Incremental Transfer job in script mode.

1. When creating the DataFrame or Glue DynamicFrame, you need to pass the option `"ENABLE_CDC": "True"`. This option ensures that you will receive a Delta Token from SAP, which can be used for subsequent retrieval of changed data.

The delta token will be present in the last row of the dataframe, in the `DELTA_TOKEN` column. This token can be used as a connector option in subsequent calls to incrementally retrieve the next set of data.

**Example**
+ We set the `ENABLE_CDC` flag to `true` when creating the DynamicFrame. Note that `ENABLE_CDC` is `false` by default, so if you don’t want to initialize the delta queue, you can omit the flag or set it to `false`. Not setting this flag to `true` results in a full load extraction.

  ```
  sapodata_df = glueContext.create_dynamic_frame.from_options(
      connection_type="SAPOData",
      connection_options={
          "connectionName": "connectionName",
          "ENTITY_NAME": "entityName",
          "ENABLE_CDC": "true"
      }, transformation_ctx=key)
  
  # Extract the delta token from the last row of the DELTA_TOKEN column
  delta_token_1 = your_logic_to_extract_delta_token(sapodata_df) # e.g., D20241029164449_000370000
  ```
+ The extracted delta token can be passed as an option to retrieve new events.

  ```
  sapodata_df_2 = glueContext.create_dynamic_frame.from_options(
      connection_type="SAPOData",
      connection_options={
          "connectionName": "connectionName",
          "ENTITY_NAME": "entityName",
          # passing the delta token retrieved in the last run
          "DELTA_TOKEN": delta_token_1
      }, transformation_ctx=key)
  
  # Extract the new delta token for the next run
  delta_token_2 = your_logic_to_extract_delta_token(sapodata_df_2)
  ```

Note that the last record, in which the `DELTA_TOKEN` is present, is not a transactional record from source, and is only there for the purpose of passing the delta token value.

Apart from the `DELTA_TOKEN`, the following fields are returned in each row of the dataframe. 
+ **GLUE_FETCH_SQ**: This is a sequence field, generated from the epoch timestamp in the order the record was received, and is unique for each record. It can be used if you need to know or establish the order of changes in the source system. This field is present only for ODP-enabled entities.
+ **DML_STATUS**: This shows `UPDATED` for all newly inserted and updated records from the source, and `DELETED` for records that have been deleted from the source.
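The examples above leave `your_logic_to_extract_delta_token` undefined. One minimal pure-Python sketch, assuming the rows have been collected from the DataFrame (for example via `df.collect()`) and that the `DELTA_TOKEN` column is populated only on the trailing record (the function name and row representation here are illustrative, not part of the connector):

```
def extract_delta_token(rows):
    # rows: records collected from the DataFrame, in order; the token
    # rides on the trailing non-transactional record.
    for row in reversed(rows):
        token = row.get("DELTA_TOKEN")
        if token:
            return token
    return None
```

Scanning from the end avoids walking the full result set when only the trailer carries the token.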

For more details about how to manage state and reuse the delta token to retrieve changed records through an example, refer to the [Using the SAP OData state management script](sap-odata-state-management-script.md) section.

## Delta Token Invalidation


A delta token is associated with the service collection and a user. If a new initial pull with `"ENABLE_CDC": "true"` is initiated for the same service collection and user, all previous delta tokens issued as a result of a previous initialization are invalidated by the SAP OData service. Invoking the connector with an expired delta token leads to an exception:

`Could not open data access via extraction API RODPS_REPL_ODP_OPEN` 

# OData Services (Non-ODP Sources)


## Full Load


For Non-ODP (Operational Data Provisioning) systems, a **Full Load** involves extracting the entire dataset from the source system and loading it into the target system. Since Non-ODP systems do not inherently support advanced data extraction mechanisms like deltas, the process is straightforward but can be resource-intensive depending on the size of the data.

## Incremental Load


For systems or entities that do not support **ODP (Operational Data Provisioning)**, incremental data transfer can be managed manually by implementing a timestamp based mechanism to track and extract changes.

**Timestamp based Incremental Transfers**

For non-ODP enabled entities (or for ODP-enabled entities that don’t use the `ENABLE_CDC` flag), you can use the `filteringExpression` option in the connector to indicate the `datetime` interval for which you want to retrieve data. This method relies on a timestamp field in your data that represents when each record was last created or modified.

**Example**

Retrieving records that changed after `2024-01-01T00:00:00.000`:

```
sapodata_df = glueContext.create_dynamic_frame.from_options(
    connection_type="SAPOData",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "filteringExpression": "LastChangeDateTime >= 2024-01-01T00:00:00.000"
    }, transformation_ctx=key)
```

Note: In this example, `LastChangeDateTime` is the field that represents when each record was last modified. The actual field name may vary depending on your specific SAP OData entity.

To get a new subset of data in subsequent runs, you would update the `filteringExpression` with a new timestamp. Typically, this would be the maximum timestamp value from the previously retrieved data.

**Example**

```
max_timestamp = get_max_timestamp(sapodata_df)  # Function to get the max timestamp from the previous run
next_filtering_expression = f"LastChangeDateTime > {max_timestamp}"

# Use this next_filtering_expression in your next run
```
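`get_max_timestamp` above is a placeholder. A minimal sketch, assuming the timestamp field arrives as ISO-8601 strings (which sort lexicographically in chronological order) and that rows have been collected from the previous run's DataFrame; the field name is just this example's:

```
def get_max_timestamp(rows, field="LastChangeDateTime"):
    # ISO-8601 timestamps compare correctly as plain strings.
    values = [row[field] for row in rows if row.get(field)]
    return max(values) if values else None
```

If your entity returns native datetime values instead of strings, `max` still works, but the filtering expression must then be formatted back into the entity's expected literal syntax.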

In the next section, we will provide an automated approach to manage these timestamp-based incremental transfers, eliminating the need to manually update the filtering expression between runs.

# Writing to SAP OData


 This section describes how to write data to your SAP OData Service using the AWS Glue connector for SAP OData. 

**Prerequisites**
+ Access to an SAP OData service
+ An SAP OData EntitySet Object you would like to write to. You will need the Object name.
+ Valid SAP OData credentials and a valid connection
+ Appropriate permissions as described in [IAM policies](https://docs.aws.amazon.com/glue/latest/dg/sap-odata-configuring-iam-permissions.html)

The SAP OData connector supports two write operations:
+ INSERT
+ UPDATE

While using the UPDATE write operation, `ID_FIELD_NAMES` must be provided to specify the external ID field for the records.

**Example:**

```
sapodata_write = glueContext.write_dynamic_frame.from_options(
    frame=frameToWrite,
    connection_type="sapodata",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "WRITE_OPERATION": "INSERT"
    }
)
```

# Using the SAP OData state management script


To use the SAP OData state management script in your AWS Glue job, follow these steps:
+ Download the state management script `s3://aws-blogs-artifacts-public/artifacts/BDB-4789/sap_odata_state_management.zip` from the public Amazon S3 bucket.
+ Upload the script to an Amazon S3 bucket that your AWS Glue job has permissions to access.
+ Reference the script in your AWS Glue job: When creating or updating your AWS Glue job, pass the `'--extra-py-files'` option referencing the script path in your Amazon S3 bucket. For example: `--extra-py-files s3://your-bucket/path/to/sap_odata_state_management.py`
+ Import and use the state management library in your AWS Glue job scripts.

## Delta-token based Incremental Transfer example


Here's an example of how to use the state management script for delta-token based incremental transfers:

```
from sap_odata_state_management import StateManagerFactory, StateManagerType, StateType

# Initialize the state manager
state_manager = StateManagerFactory.create_manager(
    manager_type=StateManagerType.JOB_TAG,
    state_type=StateType.DELTA_TOKEN,
    options={
        "job_name": args['JOB_NAME'],
        "logger": logger
    }
)

# Get connector options (including delta token if available)
key = "SAPODataNode"
connector_options = state_manager.get_connector_options(key)

# Use the connector options in your Glue job
df = glueContext.create_dynamic_frame.from_options(
    connection_type="SAPOData",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "ENABLE_CDC": "true",
        **connector_options
    }
)

# Process your data here...

# Update the state after processing
state_manager.update_state(key, df.toDF())
```

## Timestamp based Incremental Transfer example


Here's an example of how to use the state management script for timestamp based incremental transfers:

```
from sap_odata_state_management import StateManagerFactory, StateManagerType, StateType

# Initialize the state manager
state_manager = StateManagerFactory.create_manager(
    manager_type=StateManagerType.JOB_TAG,
    state_type=StateType.TIMESTAMP,
    options={
        "job_name": args['JOB_NAME'],
        "logger": logger
    }
)

# Get connector options (including delta token if available)
key = "SAPODataNode"
connector_options = state_manager.get_connector_options(key)

# Use the connector options in your Glue job
df = glueContext.create_dynamic_frame.from_options(
    connection_type="SAPOData",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "ENABLE_CDC": "true",
        **connector_options
    }
)

# Process your data here...

# Update the state after processing
state_manager.update_state(key, df.toDF())
```

In both examples, the state management script handles the complexities of storing the state (either delta token or timestamp) between job runs. It automatically retrieves the last known state when getting connector options and updates the state after processing, ensuring that each job run only processes new or changed data.
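The pattern behind the script can be illustrated with an in-memory toy (this is not the real library's implementation — in practice the state lives in a durable store such as job tags, not a dict):

```
class ToyStateManager:
    """Illustrative only: mirrors the get_connector_options /
    update_state pattern with an in-memory dict as the state store."""

    def __init__(self):
        self._state = {}

    def get_connector_options(self, key):
        # First run: no stored token, so the caller falls back to ENABLE_CDC.
        token = self._state.get(key)
        return {"DELTA_TOKEN": token} if token else {}

    def update_state(self, key, token):
        # Persist the token extracted after processing this run's data.
        self._state[key] = token
```

On the first run, `get_connector_options` returns no `DELTA_TOKEN`, so the `ENABLE_CDC` flag drives an initial pull; subsequent runs receive the stored token and pull only the delta.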

# Partitioning for Non ODP entities


In Apache Spark, partitioning refers to the way data is divided and distributed across the worker nodes in a cluster for parallel processing. Each partition is a logical chunk of data that can be processed independently by a task. Partitioning is a fundamental concept in Spark that directly impacts performance, scalability, and resource utilization. AWS Glue jobs use Spark's partitioning mechanism to divide the dataset into smaller chunks (partitions) that can be processed in parallel across the cluster's worker nodes. Note that partitioning is not applicable for ODP entities.

For more details, see [AWS Glue Spark and PySpark jobs](https://docs.aws.amazon.com/glue/latest/dg/spark_and_pyspark.html).

**Prerequisites**

An SAP OData object you would like to read from. You will need the object/EntitySet name, for example: `/sap/opu/odata/sap/API_SALES_ORDER_SRV/A_SalesOrder`.

**Example**

```
sapodata_read = glueContext.create_dynamic_frame.from_options(
    connection_type="SAPOData",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "/sap/opu/odata/sap/API_SALES_ORDER_SRV/A_SalesOrder"
    }, transformation_ctx=key)
```

## Partitioning Queries


### Field Based partitioning


You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query would be split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently. Integer, Date and DateTime fields support field-based partitioning in the SAP OData connector.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an inclusive lower bound value of the chosen partition field.

   For any field whose data type is DateTime, the Spark timestamp format used in Spark SQL queries is accepted.

  Examples of valid values: `"2000-01-01T00:00:00.000Z"` 
+ `UPPER_BOUND`: an exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: number of partitions.
+ `PARTITION_BY`: the type of partitioning to be performed; `FIELD` is to be passed in case of field-based partitioning.

**Example**

```
sapodata = glueContext.create_dynamic_frame.from_options(
    connection_type="sapodata",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "/sap/opu/odata/sap/SEPM_HCM_SCENARIO_SRV/EmployeeSet",
        "PARTITION_FIELD": "validStartDate",
        "LOWER_BOUND": "2000-01-01T00:00:00.000Z",
        "UPPER_BOUND": "2020-01-01T00:00:00.000Z",
        "NUM_PARTITIONS": "10",
        "PARTITION_BY": "FIELD"
    }, transformation_ctx=key)
```
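Conceptually, the bounds are divided into `NUM_PARTITIONS` even sub-ranges, one per concurrent sub-query. A sketch of that split (illustrative only — the connector's actual boundary arithmetic is internal to AWS Glue):

```
from datetime import datetime

def split_partitions(lower, upper, num_partitions):
    # Evenly divide [lower, upper) into num_partitions half-open sub-ranges,
    # one per Spark task; adjacent ranges share a boundary, so every value
    # in [lower, upper) lands in exactly one partition.
    step = (upper - lower) / num_partitions
    return [(lower + i * step, lower + (i + 1) * step)
            for i in range(num_partitions)]
```

With the bounds from the example above, `split_partitions(datetime(2000, 1, 1), datetime(2020, 1, 1), 10)` yields ten two-year windows.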

### Record Based partitioning


The original query would be split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.

Record-based partitioning is only supported for non-ODP entities, as pagination in ODP entities is supported through the next token/skip token.
+ `PARTITION_BY`: the type of partitioning to be performed; `COUNT` is to be passed in case of record-based partitioning.

**Example**

```
sapodata = glueContext.create_dynamic_frame.from_options(
    connection_type="sapodata",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "/sap/opu/odata/sap/SEPM_HCM_SCENARIO_SRV/EmployeeSet",
        "NUM_PARTITIONS": "10",
        "PARTITION_BY": "COUNT"
    }, transformation_ctx=key)
```

# Limitations / Callouts

+ ODP entities are not compatible with Record Based Partitioning since pagination is handled using skip token/delta token. Consequently, for Record Based Partitioning, the default value for maxConcurrency is set to "null" irrespective of the user input.
+ When both a limit and partitioning are applied, the limit takes precedence over partitioning.

# SAP OData connection options


The following are connection options for SAP OData:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in SAP OData.

  For example: `/sap/opu/odata/sap/API_SALES_ORDER_SRV/A_SalesOrder`
+ `API_VERSION`(String) - (Optional) Used for Read. SAP OData Rest API version you want to use. Example: 2.0.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object.

  For example: SalesOrder
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.

  For example: `SalesOrder = "10"`
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.

  For example: `SELECT * FROM /sap/opu/odata/sap/API_SALES_ORDER_SRV/A_SalesOrder`
+ `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition query.

  For example: `ValidStartDate`
+ `LOWER_BOUND`(String)- Used for Read. An inclusive lower bound value of the chosen partition field.

  For example: `"2000-01-01T00:00:00.000Z"`
+ `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field.

  For example: `"2024-01-01T00:00:00.000Z"`
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read.
+ `INSTANCE_URL`(String) - The SAP instance application host URL.

  For example: `https://example-externaldata.sierra.aws.dev`
+ `SERVICE_PATH`(String) - The SAP instance application service path.

  For example: `/sap/opu/odata/iwfnd/catalogservice;v=2`
+ `CLIENT_NUMBER`(String) - The SAP instance application client number.

  For example: 100
+ `PORT_NUMBER`(String) - The SAP instance application port number.

  For example: 443
+ `LOGON_LANGUAGE`(String) - The SAP instance application logon language.

  For example: `EN`
+ `ENABLE_CDC`(String) - Defines whether to run a job with CDC enabled, that is, with change tracking.

  For example: `True/False`
+ `DELTA_TOKEN`(String) - Runs an incremental data pull based on the valid Delta Token supplied. 

  For example: `D20241107043437_000463000`
+ `PAGE_SIZE`(Integer) - Defines the page size for querying the records. The default page size is 50,000. When a page size is specified, SAP returns only the defined number of records per API call, rather than the entire dataset. The connector will still provide the total number of records and handle pagination using your specified page size. If you require a larger page size, you can choose any value up to 500,000, which is the maximum allowed. Any specified page size exceeding 500,000 will be ignored. Instead, the system will use the maximum allowed page size. You can specify the page size in the AWS Glue Studio UI by adding a connection option `PAGE_SIZE` with your desired value. 

  For example: `20000`
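The `PAGE_SIZE` rules above (default 50,000; values above the 500,000 maximum are replaced by the maximum) reduce to a simple clamp, sketched here for illustration — the function name is ours, not a connector API:

```
DEFAULT_PAGE_SIZE = 50_000
MAX_PAGE_SIZE = 500_000

def effective_page_size(requested=None):
    # No PAGE_SIZE option supplied: the connector uses the default.
    if requested is None:
        return DEFAULT_PAGE_SIZE
    # Oversized values are ignored in favor of the maximum allowed.
    return min(requested, MAX_PAGE_SIZE)
```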

# SAP OData entity and field details


[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/sap-odata-entity-field-details.html)

# Connecting to SendGrid

SendGrid is a customer communication platform for transactional and marketing emails.
+ The SendGrid connector helps in creating and managing contact lists and creating email marketing campaigns.
+ SendGrid allows online businesses, non-profits, and other online entities to create and send marketing emails to large audiences and monitor engagement with those emails.

**Topics**
+ [

# AWS Glue support for SendGrid
](sendgrid-support.md)
+ [

# Policies containing the API operations for creating and using connections
](sendgrid-configuring-iam-permissions.md)
+ [

# Configuring SendGrid
](sendgrid-configuring.md)
+ [

# Configuring SendGrid connections
](sendgrid-configuring-connections.md)
+ [

# Reading from SendGrid entities
](sendgrid-reading-from-entities.md)
+ [

# SendGrid connection options
](sendgrid-connection-options.md)
+ [

# SendGrid limitations
](sendgrid-limitations.md)

# AWS Glue support for SendGrid


AWS Glue supports SendGrid as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from SendGrid.

**Supported as a target?**  
No.

**Supported SendGrid API versions**  
The following SendGrid API versions are supported:
+ v3

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]


```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

If you prefer not to create a custom policy, you can alternatively use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring SendGrid


Before you can use AWS Glue to transfer data from SendGrid, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a SendGrid account with an API key.
+ Your SendGrid account has API access with a valid license.

If you meet these requirements, you’re ready to connect AWS Glue to your SendGrid account. For typical connections, you don't need to do anything else in SendGrid.

# Configuring SendGrid connections


SendGrid supports custom authentication.

For public SendGrid documentation on generating the required API keys for custom authentication, see [Authentication](https://www.twilio.com/docs/sendgrid/api-reference/how-to-use-the-sendgrid-v3-api/authentication).

To configure a SendGrid connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For the customer managed connected app, the Secret should contain the connected app Consumer Secret with `api_key` as key.

   1. Note: you must create a secret for your connections in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps:

   1. When selecting a **Connection type**, select SendGrid.

   1. Provide the `INSTANCE_URL` of the SendGrid instance you want to connect to.

   1. Select the AWS IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]


      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want to use for this connection in AWS Glue.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.
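For reference, the Secrets Manager secret from step 1 takes this shape, with your own key in place of the placeholder:

```
{
  "api_key": "<your SendGrid API key>"
}
```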

# Reading from SendGrid entities


**Prerequisite**

A SendGrid object you would like to read from. You will need the object name such as `lists`, `singlesends` or `segments`.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select * | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Lists | No | Yes | No | Yes | No | 
| Single Sends | Yes | Yes | No | Yes | No | 
| Marketing Campaign Stats-Automations | Yes | Yes | No | Yes | No | 
| Marketing Campaign Stats-Single Sends | Yes | Yes | No | Yes | No | 
| Segments | Yes | No | No | Yes | No | 
| Contacts | Yes | No | No | Yes | No | 
| Category | No | No | No | Yes | No | 
| Stats | Yes | No | No | Yes | No | 
| Unsubscribe Groups | Yes | No | No | Yes | No | 

**Example**:

```
sendgrid_read = glueContext.create_dynamic_frame.from_options(
    connection_type="sendgrid",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "lists",
        "API_VERSION": "v3",
        "INSTANCE_URL": "instanceUrl"
    }
)
```

**SendGrid entity and field details**:

Entities with static metadata:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/sendgrid-reading-from-entities.html)

**Note**  
Struct and List data types are converted to String data type, and DateTime data type is converted to Timestamp in the response of the connectors.

## Partitioning queries


SendGrid doesn't support filter-based partitioning or record-based partitioning.

# SendGrid connection options


The following are connection options for SendGrid:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in SendGrid.
+ `API_VERSION`(String) - (Required) Used for Read. SendGrid Rest API version you want to use.
+ `INSTANCE_URL`(String) - (Required) Used for Read. A valid SendGrid instance URL.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT *). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.

# SendGrid limitations


The following are limitations or notes for SendGrid:
+ Incremental pull is only supported by the Stats entity on the `start_date` field and by the Contact entity on the `event_timestamp` field.
+ Pagination is only supported by the Marketing Campaign Stats (Automations), Marketing Campaign Stats (Single Sends), Single Sends, and Lists entities.
+ For the Stats entity, `start_date` is a mandatory filter parameter.
+ An API key with Restricted Access can’t support read access for the Email API and Stats entities. Use an API key with Full Access. For more information, see [API Overview](https://www.twilio.com/docs/sendgrid/api-reference/api-keys/create-api-keys#api-overview).

# Connecting to ServiceNow

ServiceNow is a cloud-based SaaS platform for automating IT management workflows. The ServiceNow platform integrates with other tools, letting users manage projects, teams, and customer interactions through a variety of apps and plugins. As a ServiceNow user, you can connect AWS Glue to your ServiceNow account. Then, you can use ServiceNow as a data source in your ETL jobs. Run these jobs to transfer data between ServiceNow and AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for ServiceNow
](servicenow-support.md)
+ [

# Policies containing the API operations for creating and using connections
](servicenow-configuring-iam-permissions.md)
+ [

# Configuring ServiceNow
](servicenow-configuring.md)
+ [

# Configuring ServiceNow connections
](servicenow-configuring-connections.md)
+ [

# Reading from ServiceNow entities
](servicenow-reading-from-entities.md)
+ [

# ServiceNow connection options
](servicenow-connection-options.md)
+ [

# Limitations and notes for ServiceNow connector
](servicenow-connector-limitations.md)

# AWS Glue support for ServiceNow


AWS Glue supports ServiceNow as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from ServiceNow.

**Supported as a target?**  
No.

**Supported ServiceNow API versions**  
The following ServiceNow API versions are supported:
+ v2

For entity support by API version, see Supported entities for the source.

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring ServiceNow


Before you can use AWS Glue to transfer data from ServiceNow, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a ServiceNow account with email and password. For more information, see [Creating a ServiceNow account](#servicenow-configuring-creating-servicenow-account).
+ Your ServiceNow account is enabled for API access. All use of the ServiceNow API is available at no additional cost.

If you meet these requirements, you’re ready to connect AWS Glue to your ServiceNow account.

## Creating a ServiceNow account


To create a ServiceNow account:

1. Navigate to the sign-up page on servicenow.com, enter your details, and choose **Continue**.

1. When you receive a verification code at your registered email address, enter the code and choose **Verify**.

1. Set up multi-factor authentication or skip doing so.

Your account is created and ServiceNow displays your profile.

## Creating a ServiceNow developer instance


Request a developer instance after logging in to ServiceNow.

1. At the [ServiceNow login page](https://signon.service-now.com/x_snc_sso_auth.do?pageId=username), enter your account credentials.

1. Choose the **ServiceNow Developer Program**.  
![\[\]](http://docs.aws.amazon.com/glue/latest/dg/images/servicenow-dev-program.png)

1. Choose **Request Instance** in the top right.

1. Enter your job responsibilities. Indicate your agreement to the terms of use, and choose **Finish setup**.

1. Once the instance is created, note your instance URL and credentials.

## Retrieving BasicAuth credentials


To retrieve Basic Auth credentials for a free account:

1. At the [ServiceNow login page](https://signon.service-now.com/x_snc_sso_auth.do?pageId=username), enter your account credentials.

1. On the home page choose the edit profile section (top right corner) and choose **Manage Instance Password**.

1. Retrieve the login credentials such as username, password, and instance URL.

**Note**  
If MFA is enabled for the account, append the MFA token to the end of the user's password in the basic auth credentials: `<username>:<password><MFA token>`

For more information, see [Building applications](https://docs.servicenow.com/bundle/xanadu-application-development/page/build/custom-application/concept/build-applications.html) in the ServiceNow documentation.
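When MFA is enabled, the note above says to append the MFA token to the password. A minimal sketch of composing that credential string (all values are placeholders):

```python
# Sketch: basic-auth credential with an appended MFA token, per the note above.
# All values are placeholders.
username = "servicenow-user"
password = "servicenow-password"
mfa_token = "123456"

# Format: <username>:<password><MFA token>
credential = f"{username}:{password}{mfa_token}"
print(credential)
```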

## Creating OAuth 2.0 credentials


To use OAuth 2.0 with the ServiceNow connector, you need to create an inbound client to generate the Client ID and Client Secret:

1. At the [ServiceNow login page](https://signon.service-now.com/x_snc_sso_auth.do?pageId=username), enter your account credentials.

1. On the home page choose **Start Building**.

1. On the App Engine Studio page, search for **Application Registry**.

1. Choose **New** in the top right.

1. Choose the **Create an OAuth API endpoint for external clients** option.

1. Make any required changes to the OAuth configuration and choose **Update**.

   Example for Redirect URL: https://us-east-1.console.aws.amazon.com/gluestudio/oauth

1. Select the newly created OAuth client app to retrieve the Client ID and Client Secret.

1. Store the Client ID and Client Secret for further processing.

To configure OAuth in a non-production developer account:

1. Create an authentication profile using the [Create an authentication profile](https://docs.servicenow.com/bundle/washingtondc-platform-security/page/integrate/authentication/task/create-an-authentication-profile.html) topic in the ServiceNow documentation.

1. In the Authentication Profile for OAuth, select **Type** as OAuth and select the above-created inbound client to set the **OAuth Entity**.

1. If there are multiple clients, then you need to create multiple authentication profiles to set the required OAuth entity in the authentication profile.

1. If one is not already configured, create a REST API access policy to grant access to the Table API. See [Create REST API access policy](https://docs.servicenow.com/bundle/washingtondc-platform-security/page/integrate/authentication/task/create-api-access-policy.html).

# Configuring ServiceNow connections


The grant type determines how AWS Glue communicates with ServiceNow to request access to your data. Your choice affects the requirements that you must meet before you create the connection. ServiceNow supports only the `AUTHORIZATION_CODE` grant type for OAuth 2.0.
+ This grant type is considered "three-legged" OAuth because it relies on redirecting users to a third-party authorization server to authenticate the user. It is used when creating connections through the AWS Glue console. The AWS Glue console redirects the user to ServiceNow, where the user must log in and grant AWS Glue the requested permissions to access their ServiceNow instance.
+ Users may still opt to create their own connected app in ServiceNow and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they are still redirected to ServiceNow to log in and authorize AWS Glue to access their resources.
+ This grant type results in a refresh token and an access token. The access token is short-lived and may be refreshed automatically, without user interaction, using the refresh token.
+ For public ServiceNow documentation on creating a connected app for Authorization Code OAuth flow, see [Set up OAuth](https://docs.servicenow.com/bundle/vancouver-platform-security/page/administer/security/task/t_SettingUpOAuth.html).

To configure a ServiceNow connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For basic authentication, the secret should contain `USERNAME` and `PASSWORD` as keys.

   1. For the authorization code grant type, the secret should contain the connected app's client secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key.

   1. Note: You must create one secret per connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps:

   1. When selecting a **Connection type**, select ServiceNow.

   1. Provide the `INSTANCE_URL` of the ServiceNow instance you want to connect to.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the **Authentication Type** you want to use for this connection in AWS Glue.

      1. Basic Auth: this auth type works well for automation scenarios because it allows you to provide a username and password up front, with the permissions of a particular user in the ServiceNow instance. AWS Glue uses the username and password to authenticate to the ServiceNow APIs. Enter the following inputs only for Basic Auth: `Username` and `Password`.

      1. OAuth2: enter the following inputs only in case of OAuth2: `ClientId`, `ClientSecret`, `Authorization URL`, `Authorization Token URL`.

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.
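The secret bodies described in step 1 can be sketched as JSON strings. The key names follow the steps above, while all values and the secret name are placeholder assumptions:

```python
import json

# Sketch: secret strings for the two authentication types described above.
# Values are placeholders; store real credentials in AWS Secrets Manager.
basic_auth_secret = json.dumps(
    {"USERNAME": "servicenow-user", "PASSWORD": "servicenow-password"}
)
oauth_secret = json.dumps(
    {"USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET": "client-secret"}
)

# These strings could then be stored with, for example:
#   aws secretsmanager create-secret --name my-servicenow-secret \
#       --secret-string "<one of the strings above>"
print(basic_auth_secret)
```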

# Reading from ServiceNow entities


**Prerequisite**

A ServiceNow Tables object you would like to read from. You will need the object name, such as `pa_buckets` or `incident`.

**Example**:

```
servicenow_read = glueContext.create_dynamic_frame.from_options(
    connection_type="servicenow",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "pa_buckets",
        "API_VERSION": "v2",
        "instanceUrl": "https://<instance-name>.service-now.com"
    }
)
```

**ServiceNow entity and field details**:

For the following entities, ServiceNow provides endpoints to fetch metadata dynamically, so that operator support is captured at the datatype level for each entity.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/servicenow-reading-from-entities.html)

**Note**  
The Struct data type is converted to a String data type in the response of the connector.

**Note**  
`DML_STATUS` is an additional user-defined attribute used for tracking CREATED/UPDATED records.

## Partitioning queries


**Field base partitioning**:

You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query would be split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/servicenow-reading-from-entities.html)
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For the Datetime field, we accept the Spark timestamp format used in Spark SQL queries.

  Example of a valid value:

  ```
  "2024-01-30T06:47:51.000Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.

Entity partitioning field support details are described at the AWS documentation link above.

Example:

```
servicenow_read = glueContext.create_dynamic_frame.from_options(
    connection_type="servicenow",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "pa_buckets",
        "API_VERSION": "v2",
        "instanceUrl": "https://<instance-name>.service-now.com",
        "PARTITION_FIELD": "sys_created_on",
        "LOWER_BOUND": "2024-01-30T06:47:51.000Z",
        "UPPER_BOUND": "2024-06-30T06:47:51.000Z",
        "NUM_PARTITIONS": "10"
    }
)
```

**Record-based partitioning**:

You can provide the additional Spark option `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With this parameter, the original query is split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.

In record-based partitioning, the total number of records present is queried from the ServiceNow API, and it is divided by `NUM_PARTITIONS` number provided. The resulting number of records are then concurrently fetched by each sub-query.
+ `NUM_PARTITIONS`: the number of partitions.

Example:

```
servicenow_read = glueContext.create_dynamic_frame.from_options(
    connection_type="servicenow",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "pa_buckets",
        "API_VERSION": "v2",
        "instanceUrl": "https://<instance-name>.service-now.com",
        "NUM_PARTITIONS": "2"
    }
)
```
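The record-count split described above can be sketched as plain arithmetic: the total is divided into `NUM_PARTITIONS` contiguous ranges, each fetched by one sub-query. This illustrates the splitting logic only; it is not the connector's actual implementation.

```python
# Sketch: splitting a total record count into NUM_PARTITIONS contiguous
# (offset, limit) ranges, one per concurrent sub-query.
def partition_ranges(total_records: int, num_partitions: int):
    """Return (offset, limit) pairs that together cover total_records."""
    base, remainder = divmod(total_records, num_partitions)
    ranges, offset = [], 0
    for i in range(num_partitions):
        # Spread any remainder across the first partitions.
        limit = base + (1 if i < remainder else 0)
        ranges.append((offset, limit))
        offset += limit
    return ranges

# 10 records over 3 partitions -> [(0, 4), (4, 3), (7, 3)]
print(partition_ranges(10, 3))
```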

# ServiceNow connection options


The following are connection options for ServiceNow:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in ServiceNow.
+ `API_VERSION`(String) - (Required) Used for Read. The ServiceNow REST API version you want to use. For example: v1, v2, v3, v4.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT \*). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. Must be in Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for Read. The field used to partition the query.
+ `LOWER_BOUND`(String)- Used for Read. An inclusive lower bound value of the chosen partition field. For example: 2024-01-30T06:47:51.000Z.
+ `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field. For example: 2024-06-30T06:47:51.000Z.
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read. For example: 10.
+ `INSTANCE_URL`(String) - (Required) A valid ServiceNow instance URL with format https://<instance-name>.service-now.com.
+ `PAGE_SIZE`(Integer) - Defines the page size for querying the records. The default page size is 1,000. When a page size is specified, ServiceNow returns only the defined number of records per API call, rather than the entire dataset. The connector will still provide the total number of records and handle pagination using your specified page size. If you require a larger page size, you can choose any value up to 10,000, which is the maximum allowed. Any specified page size exceeding 10,000 will be ignored. Instead, the system will use the maximum allowed page size. You can specify the page size in the AWS Glue Studio UI by adding a connection option `PAGE_SIZE` with your desired value. For example: 5000.
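The `PAGE_SIZE` clamping behavior described above can be sketched as follows. The 1,000 default and 10,000 maximum come from the option description; the function name is illustrative:

```python
# Sketch: page-size handling as described for the PAGE_SIZE option.
DEFAULT_PAGE_SIZE = 1_000
MAX_PAGE_SIZE = 10_000

def effective_page_size(requested=None):
    """Return the page size the connector would use for each API call."""
    if requested is None:
        return DEFAULT_PAGE_SIZE
    # Values above the maximum are ignored; the maximum is used instead.
    return min(requested, MAX_PAGE_SIZE)

print(effective_page_size(5_000), effective_page_size(20_000))
```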

# Limitations and notes for ServiceNow connector


The following are limitations or notes for the ServiceNow connector:
+ As per [SaaS documentation](https://www.servicenow.com/docs/bundle/washingtondc-application-development/page/build/applications/reference/r_GlobalDefaultFields.html), `sys_created_on`, `sys_updated_on`, and `sys_mod_count` are system generated fields. The connector relies on SaaS APIs to provide these fields in the response body.
  + If SaaS doesn't generate these fields for an entity, filter-based partitioning cannot be supported.
+ If SaaS APIs don't return `sys_created_on` and `sys_updated_on` fields in the response, `DML_STATUS` cannot be calculated.
+ Enhanced read performance and efficiency:
  + The ServiceNow connector now automatically sorts records in ascending order by the `sys_id` field (which must be present in the metadata) when no ORDER BY clause is specified by the user. In this case, records are paginated with the new, optimized keyset-based pagination.
  + If an ORDER BY clause is specified, the new optimization is not used, and records are fetched using the existing method (user-defined ORDER BY with offset-limit-based pagination).

# Connecting to Slack in AWS Glue Studio

Slack is an enterprise communications app that lets users send messages and attachments through various public and private channels. If you're a Slack user, you can connect AWS Glue to your Slack account. Then, you can use Slack as a data source in your ETL jobs. Run these jobs to transfer data between Slack and AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for Slack
](slack-support.md)
+ [

# Policies containing the API operations for creating and using connections
](slack-configuring-iam-permissions.md)
+ [

# Configuring Slack
](slack-configuring.md)
+ [

# Configuring Slack connections
](slack-configuring-connections.md)
+ [

# Reading from Slack entities
](slack-reading-from-entities.md)
+ [

# Slack connection options
](slack-connection-options.md)
+ [

# Limitations
](slack-limitations.md)
+ [

# Creating a new Slack account and configuring the client app
](slack-new-account-creation.md)

# AWS Glue support for Slack


AWS Glue supports Slack as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Slack.

**Supported as a target?**  
No.

**Supported Slack API versions**  
 Slack API v2. 

# Policies containing the API operations for creating and using connections

 The following sample policy describes the required IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

You can also use the following managed IAM policies to allow access:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, Amazon CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring Slack


Before you can use AWS Glue to transfer data to or from Slack, you must meet these requirements:

## Minimum requirements

+  You must have a Slack account. For more information, see [Creating a new Slack account and configuring the client app](slack-new-account-creation.md). 

 If you meet these requirements, you’re ready to connect AWS Glue to your Slack account. 

# Configuring Slack connections


 Slack supports the `AUTHORIZATION_CODE` grant type for OAuth 2. 

This grant type is considered "three-legged" OAuth because it relies on redirecting users to a third-party authorization server to authenticate the user. It is used when creating connections through the AWS Glue console. The AWS Glue console redirects the user to Slack, where the user must log in and grant AWS Glue the requested permissions to access their Slack instance.

Users may still opt to create their own connected app in Slack and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they are still redirected to Slack to log in and authorize AWS Glue to access their resources.

This grant type results in a refresh token and an access token. The access token expires 1 hour after creation. A new access token can be fetched using the refresh token.

 For more information on creating a connected app for Authorization Code OAuth flow, see [ Slack API ](https://api.slack.com/quickstart). 

To configure a Slack connection:

1. In AWS Secrets Manager, create a secret with the following details. You must create a secret for the connection in AWS Glue.

   1. For a customer-managed connected app, the secret should contain the connected app's client secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key.

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps:

   1. When selecting a **Connection type**, select Slack.

   1. Provide the Slack environment.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection.

   1.  Select the network options if you want to use your network. 

1.  Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 

# Reading from Slack entities


 **Prerequisites** 
+  A Slack object you would like to read from. 

 **Supported entities** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports SELECT \* | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| conversations | Yes | Yes | No | Yes | Yes | 

 **Example** 

```
slack_read = glueContext.create_dynamic_frame.from_options(
    connection_type="slack",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "conversations/C058W38R5J8"
    }
)
```

 **Slack entity and field details** 


| Entity | Field | Data Type | Supported Operators | 
| --- | --- | --- | --- | 
| conversations | attachments | List | NA | 
| conversations | bot\_id | String | NA | 
| conversations | blocks | List | NA | 
| conversations | client\_msg\_id | String | NA | 
| conversations | is\_starred | Boolean | NA | 
| conversations | last\_read | String | NA | 
| conversations | latest\_reply | String | NA | 
| conversations | reactions | List | NA | 
| conversations | replies | List | NA | 
| conversations | reply\_count | Integer | NA | 
| conversations | reply\_users | List | NA | 
| conversations | reply\_users\_count | Integer | NA | 
| conversations | subscribed | Boolean | NA | 
| conversations | subtype | String | NA | 
| conversations | text | String | NA | 
| conversations | team | String | NA | 
| conversations | thread\_ts | String | NA | 
| conversations | ts | String | EQUAL\_TO, BETWEEN, LESS\_THAN, LESS\_THAN\_OR\_EQUAL\_TO, GREATER\_THAN, GREATER\_THAN\_OR\_EQUAL\_TO | 
| conversations | type | String | NA | 
| conversations | user | String | NA | 
| conversations | inviter | String | NA | 
| conversations | root | Struct | NA | 
| conversations | is\_locked | Boolean | NA | 
| conversations | files | List | NA | 
| conversations | room | Struct | NA | 
| conversations | upload | Boolean | NA | 
| conversations | display\_as\_bot | Boolean | NA | 
| conversations | channel | String | NA | 
| conversations | no\_notifications | Boolean | NA | 
| conversations | permalink | String | NA | 
| conversations | pinned\_to | List | NA | 
| conversations | pinned\_info | Struct | NA | 
| conversations | edited | Struct | NA | 
| conversations | app\_id | String | NA | 
| conversations | bot\_profile | Struct | NA | 
| conversations | metadata | Struct | NA | 

 **Partitioning queries** 

You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an inclusive lower bound value of the chosen partition field.

  For date fields, we accept the Spark date format used in Spark SQL queries. Example of a valid value: `"2024-07-01T00:00:00.000Z"`.
+ `UPPER_BOUND`: an exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.

The following table describes partitioning field support by entity:


| Entity Name | Partitioning Field | Data Type | 
| --- | --- | --- | 
| conversations | ts | String | 

 **Example** 

```
slack_read = glueContext.create_dynamic_frame.from_options(
    connection_type="slack",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "conversations/C058W38R5J8",
        "PARTITION_FIELD": "ts",
        "LOWER_BOUND": "2022-12-01T00:00:00.000Z",
        "UPPER_BOUND": "2024-09-23T15:00:00.000Z",
        "NUM_PARTITIONS": "2"
    }
)
```

# Slack connection options


The following are connection options for Slack:
+  `ENTITY_NAME`(String) - (Required) Used for Read. Supported entity name. Example: `conversations/C058W38R5J8`. 
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT \*). Used for Read. Fields you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. Must be in Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+  `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read. 
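Because `ts` is the only conversations field that supports range operators (see the entity table above), a `FILTER_PREDICATE` in Spark SQL format would typically target it. The channel ID and timestamp value below are placeholders:

```python
# Sketch: Slack read options with a Spark SQL filter on the ts field.
# The channel ID and timestamp value are placeholders.
options = {
    "connectionName": "connectionName",
    "ENTITY_NAME": "conversations/C058W38R5J8",
    "FILTER_PREDICATE": "ts >= '2024-01-01T00:00:00.000Z'",
}

# Passed as connection_options to
# glueContext.create_dynamic_frame.from_options(connection_type="slack", ...).
print(sorted(options))
```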

# Limitations


The following are limitations for the Slack connector:
+ Record-based partitioning is not supported because the connector does not provide a means to fetch the total number of records (messages) available in a given conversation.

# Creating a new Slack account and configuring the client app


**Creating a Slack account**

1. Open the [Slack home page](https://slack.com/intl/en-in/) to sign-up for an account. 

1. Choose **SIGN UP WITH EMAIL ADDRESS**. Enter your email ID and choose **Continue**.

1. Enter the 6-character code sent to your email address. You are then redirected to create a workspace or join an existing workspace.

1. Choose **Create a workspace** to create a new workspace. It will redirect you to answer a few questions as a part of the set-up process.
   + Name of company
   + Your name
   + To add colleagues by email
   + What's your team working on? (This will be the channel name)

1. Fill in the input fields for these questions and continue. Your account is now ready to be used.



**Creating a Slack developer app**

1. Log in to your Slack account and sign into your Slack workspace.

1. From the workspace menu, select **Tools and settings** and then select **Manage apps**.

1. From the Slack App Directory menu, select **Build**.

1. On the **Your Apps** page, select **Create an App**.

1. On the **Create an app** page, select **From scratch**.

1. In the **Name app & choose workspace** dialog box that opens, add an App name and **Pick a workspace to deploy your app in**. Then select **Create App**.

1. Note down your Client ID and Client Secret displayed under App Credentials.

1. On the OAuth & Permissions sidebar, go to Scopes and choose **Add an OAuth Scope**. You can add the redirect URLs to your app for configuration to automatically generate the 'Add to Slack' button or to distribute your app. Scroll up to the Redirect URLs section and choose **Add New Redirect URL** and save. 

1. Then, scroll to OAuth Tokens for Your Workspace section, and choose **Install to Workspace**.

1. On the dialog box that opens up informing you that the app that you created is requesting permission to access the Slack workspace you wanted to connect it to, select **Allow**.

1. On successful completion, the console displays an OAuth Tokens for Your Workspace screen.

1. From the OAuth Tokens for Your Workspace screen, copy and save the OAuth token that you will use to connect to AWS Glue.

1. Next, you retrieve your Slack team ID. From the Slack workspace menu, select **Tools and settings** and then select **Manage apps**. You'll find your team ID in the URL of the page that opens.

1. To distribute your app publicly, choose **Manage Distribution** in the sidebar. Scroll down to the Share Your App with Other Workspaces section and choose **Remove Hard Coded Information**. Provide consent and choose **Activate Public Distribution**.

1. Your app is now publicly distributed. To access the entity APIs, the app needs to be added to every workspace channel the user wants to access from.

1. Sign in to your Slack account and open the workspace whose channel needs to be accessed.

1. In the workspace, open the channel that the app needs to access and choose the channel title. Select the **Integrations** tab from the pop-up and add the app. The app is then integrated with the channel and has access to its API.

   The OAuth 2.0 client ID must have one or more authorized redirect URLs. Redirect URLs have the following format:
**Note**  
AppFlow redirect URLs are subject to change once redirect URLs for the AWS Glue platform are available. The client ID and client secret are from the settings for your OAuth 2.0 client ID.     
<a name="slack-redirect-url-detail"></a>[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/slack-new-account-creation.html)

# Connecting to Smartsheet

Smartsheet is a work management and collaboration SaaS product. Fundamentally, Smartsheet allows users to use spreadsheet-like objects to create, store, and utilize business data.

**Topics**
+ [

# AWS Glue support for Smartsheet
](smartsheet-support.md)
+ [

# Policies containing the API operations for creating and using connections
](smartsheet-configuring-iam-permissions.md)
+ [

# Configuring Smartsheet
](smartsheet-configuring.md)
+ [

# Configuring Smartsheet connections
](smartsheet-configuring-connections.md)
+ [

# Reading from Smartsheet entities
](smartsheet-reading-from-entities.md)
+ [

# Smartsheet connection options
](smartsheet-connection-options.md)
+ [

# Creating a Smartsheet account
](smartsheet-create-account.md)
+ [

# Limitations
](smartsheet-connector-limitations.md)

# AWS Glue support for Smartsheet


AWS Glue supports Smartsheet as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Smartsheet.

**Supported as a target?**  
No.

**Supported Smartsheet API versions**  
 v2.0 

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring Smartsheet


Before you can use AWS Glue to transfer data from Smartsheet, you must meet the following requirements:

## Minimum requirements

+ You have a Smartsheet account with an email address and password. For more information about creating an account, see [Creating a Smartsheet account](smartsheet-create-account.md). 
+ Your Smartsheet account has API access with a valid license.
+ Your Smartsheet account has the **Pro** pricing plan for the `Sheets` entity, and the Enterprise pricing plan with the Event Reporting add-on for the `Events` entity.

If you meet these requirements, you’re ready to connect AWS Glue to your Smartsheet account. For typical connections, you don't need to do anything else in Smartsheet.

# Configuring Smartsheet connections


 Smartsheet supports the `AUTHORIZATION_CODE` grant type for OAuth2. 

This grant type is considered “three-legged” OAuth because it relies on redirecting users to the third-party authorization server to authenticate the user. Users may opt to create their own connected app in Smartsheet and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to Smartsheet to log in and authorize AWS Glue to access their resources. 

This grant type results in a refresh token and an access token. The access token is short-lived, and can be refreshed automatically without user interaction by using the refresh token. 

For public Smartsheet documentation on creating a connected app for the `AUTHORIZATION_CODE` OAuth flow, see [Smartsheet APIs](https://smartsheet.redoc.ly/#section/OAuth-Walkthrough). 

To configure a Smartsheet connection:

1. In AWS Secrets Manager, create a secret with the following details: 

   For a customer managed connected app – the secret should contain the connected app consumer secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key. 
**Note**  
You must create a separate secret for each connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps: 

   1. When selecting a **Connection type**, select Smartsheet.

   1. Provide the `instanceUrl` of the Smartsheet you want to connect to.

   1. Select an IAM role that AWS Glue can assume and that has permissions for the following actions: 

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection. 

   1.  Select the network options if you want to use your own network. 

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 
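As a sketch of step 1 above, the secret payload for a customer managed connected app can be built as follows. The secret name and client secret value are placeholders; only the key name `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` comes from the procedure above.

```python
import json

def build_connection_secret(client_secret: str) -> str:
    """Build the Secrets Manager payload for a customer managed connected app.

    The key name is required by AWS Glue; the value is your connected app's
    consumer secret ("example-client-secret" below is a placeholder).
    """
    return json.dumps({"USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET": client_secret})

secret_string = build_connection_secret("example-client-secret")
# Store it with the AWS CLI, creating one secret per connection, for example:
#   aws secretsmanager create-secret \
#       --name glue-smartsheet-connection-secret \
#       --secret-string "$secret_string"
print(secret_string)
```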

# Reading from Smartsheet entities


 **Prerequisites** 

A Smartsheet object you would like to read from. Refer to the supported entities table below to check the available entities. 

 **Supported entities** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select \* | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| List Sheet | Yes | Yes | No | Yes | No | 
| Row Metadata | Yes | Yes | No | Yes | No | 
| Sheet Metadata | No | No | No | Yes | No | 
| Sheet Data | Yes | Yes | Yes | Yes | No | 
| Event | Yes | Yes | No | Yes | No | 

 **Example** 

```
Smartsheet_read = glueContext.create_dynamic_frame.from_options(
    connection_type="smartsheet",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "list-sheets",
        "API_VERSION": "2.0",
        "INSTANCE_URL": "https://api.smartsheet.com"
    })
```

 **Smartsheet entity and field details** 

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/smartsheet-reading-from-entities.html)

**Entities with dynamic metadata:**

For the following entity, Smartsheet provides an endpoint to fetch metadata dynamically, allowing operator support to be captured at the datatype level.


| Entity | Data Type | Supported Operators | 
| --- | --- | --- | 
| Sheet Data | String | NA | 
| Sheet Data | Long | = | 
| Sheet Data | Integer | NA | 
| Sheet Data | DateTime | > | 

 **Example** 

```
Smartsheet_read = glueContext.create_dynamic_frame.from_options(
    connection_type="smartsheet",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "list-sheets",
        "API_VERSION": "2.0",
        "INSTANCE_URL": "https://api.smartsheet.com"
    })
```

# Smartsheet connection options


The following are connection options for Smartsheet:
+ `ENTITY_NAME` (String) – (Required) Used for Read/Write. The name of your object in Smartsheet. 
+ `API_VERSION` (String) – (Required) Used for Read/Write. The Smartsheet REST API version you want to use. For example: 2.0. 
+ `INSTANCE_URL` (String) – (Required) Used for Read. The Smartsheet instance URL.
+ `SELECTED_FIELDS` (List<String>) – Default: empty (SELECT \*). Used for Read. Columns you want to select for the object. 
+ `FILTER_PREDICATE` (String) – Default: empty. Used for Read. Must be in Spark SQL format. 
+ `QUERY` (String) – Default: empty. Used for Read. Full Spark SQL query. 
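As a usage sketch, these options can be assembled into the `connection_options` map passed to `create_dynamic_frame.from_options`. The selected fields and filter below are illustrative values, not real Smartsheet columns.

```python
# Connection options for a Smartsheet read. ENTITY_NAME, API_VERSION, and
# INSTANCE_URL are required; SELECTED_FIELDS and FILTER_PREDICATE are optional.
connection_options = {
    "connectionName": "connectionName",
    "ENTITY_NAME": "list-sheets",
    "API_VERSION": "2.0",
    "INSTANCE_URL": "https://api.smartsheet.com",
    "SELECTED_FIELDS": ["id", "name"],      # illustrative column names
    "FILTER_PREDICATE": "name = 'Budget'",  # Spark SQL syntax, illustrative
}

# In a Glue job script this dict is passed to the DynamicFrame reader:
# frame = glueContext.create_dynamic_frame.from_options(
#     connection_type="smartsheet", connection_options=connection_options)
```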

# Creating a Smartsheet account


1. Sign up for a Smartsheet account by accessing the [Smartsheet sign-up page](https://app.smartsheet.com/home). 

1. Choose **Create one** to create a new account, or sign in using your registered Google, Microsoft, or Apple account.

1. Enter your work email address when prompted.

1. Choose **Continue** and, if required, verify your identity.

1. Open the confirmation email from Smartsheet, and choose the confirmation link to verify your account. 

   You will be subscribed to the Trial Plan by default. 

1. In the bottom-left corner, choose the **Account** icon and choose **Add Licenses/Upgrade** to upgrade your pricing plan.
**Note**  
This is required for accessing **Event Reporting**, which is an add-on in the **Enterprise** plan.

1. Under the **Enterprise** plan, choose **Contact Us** to request an account upgrade from the support team.

1. In the support request form, provide the required details and your requirements to upgrade the plan.

   This completes the upgrade to **Enterprise** plan.

**Creating `OAuth2.0` credentials**

1. After upgrading your account’s pricing plan to get access to the **Developer Tools**, access [Smartsheet developers](https://developers.smartsheet.com/). 

   You will receive an activation email.

1. Open the activation email from Smartsheet, and choose the activation link to activate developer tools on your account. 

   Developer tools allow you to create the app.

1. Open the home page of your Smartsheet account and choose **Account** to check for access.

1. Choose **Developer Tools** from the services list, and enter the **Developer Profile** details.

1. Choose **Create New App**.

1. Enter the following details into the app registration form:
   + **Name** – Name of the app.
   + **Description** – Description of the app.
   + **URL** – URL that allows you to launch your app or the URL of the landing page.
   + **Contact/support** – Contact information for the support team.
   + **Redirect URL** – URL (also known as a callback URL) within your application that will receive the [OAuth 2.0](https://.console.aws.amazon.com/appflow/oauth) credentials. 

1. Choose **Save**.

   Smartsheet assigns a client ID and client secret to your app. Record these values for the next steps. You can also look them up again later in the **Developer Tools** section.

# Limitations


Smartsheet doesn't support field-based or record-based partitioning.

# Connecting to Snapchat Ads in AWS Glue Studio

 Snapchat is a multimedia instant messaging app and service developed by Snap Inc., originally Snapchat Inc. One of the principal features of Snapchat is that pictures and messages are usually available only for a brief time before they become inaccessible to their recipients. Snapchat ads are posts that businesses pay to serve to Snapchat users. 

**Topics**
+ [

# AWS Glue support for Snapchat Ads
](snapchat-ads-support.md)
+ [

# Policies containing the API operations for creating and using connections
](snapchat-ads-configuring-iam-permissions.md)
+ [

# Configuring Snapchat Ads
](snapchat-ads-configuring.md)
+ [

# Configuring Snapchat Ads connections
](snapchat-ads-configuring-connections.md)
+ [

# Reading from Snapchat Ads entities
](snapchat-ads-reading-from-entities.md)
+ [

# Snapchat Ads connection options
](snapchat-ads-connection-options.md)
+ [

# Creating a Snapchat Ad account and configuring the client app
](connecting-to-data-snapchat-ads-new-account.md)
+ [

# Creating an app in your Snapchat Ads account
](connecting-to-data-snapchat-ads-managed-client-application.md)

# AWS Glue support for Snapchat Ads


AWS Glue supports Snapchat Ads as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Snapchat Ads.

**Supported as a target?**  
No.

**Supported Snapchat Ads API versions**  
 v1. 

# Policies containing the API operations for creating and using connections

 The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

You can also use the following managed IAM policies to allow access:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, Amazon CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring Snapchat Ads


Before you can use AWS Glue to transfer from Snapchat Ads, you must meet these requirements:

## Minimum requirements

+  You have a Snapchat Ads account. For more information on creating an account, see [Creating a Snapchat Ad account and configuring the client app](connecting-to-data-snapchat-ads-new-account.md). 
+  You have created an OAuth2 app in your Snapchat Ads account. This integration provides the credentials that AWS Glue uses to access your data securely when it makes authenticated calls to your account. For more information, see [Creating an app in your Snapchat Ads account](connecting-to-data-snapchat-ads-managed-client-application.md). 

 If you meet these requirements, you’re ready to connect AWS Glue to your Snapchat Ads account. In Snapchat Ads, a connected app is a framework that authorizes external applications, like AWS Glue, to access your Snapchat Ads data. 

# Configuring Snapchat Ads connections


 Snapchat Ads supports only the `AUTHORIZATION_CODE` grant type. 

 This grant type is considered “three-legged” OAuth because it relies on redirecting users to the third-party authorization server to authenticate the user. It is used when creating connections through the AWS Glue console. The user creating a connection may, by default, rely on an AWS Glue owned connected app (AWS Glue managed client application), in which case they do not need to provide any OAuth-related information except for their Snapchat Ads instance URL. The AWS Glue console redirects the user to Snapchat Ads, where the user must log in and allow AWS Glue the requested permissions to access their Snapchat Ads instance. 

 Users may still opt to create their own connected app in Snapchat Ads and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to Snapchat Ads to log in and authorize AWS Glue to access their resources. 

 This grant type results in a refresh token and an access token. The access token expires one hour after creation. A new access token can be fetched using the refresh token. 

 For more information on creating a connected app for Authorization Code OAuth flow, see [ Ads API ](https://marketingapi.snapchat.com/docs/#authentication). 

To configure a Snapchat Ads connection:

1.  In AWS Secrets Manager, create a secret with the following details. You must create a separate secret for each connection in AWS Glue. 

   1.  For a customer managed connected app – the secret should contain the connected app consumer secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key. 

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps: 

   1. When selecting a **Connection type**, select Snapchat Ads.

   1. Provide the Snapchat Ads environment.

   1.  Select an IAM role that AWS Glue can assume and that has permissions for the following actions: 

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1.  Select the `secretName` that you want AWS Glue to use to store the tokens for this connection. 

   1.  Select the network options if you want to use your own network. 

1.  Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 

# Reading from Snapchat Ads entities


 **Prerequisites** 
+  A Snapchat Ads object you would like to read from. Refer to the supported entities table below to check the available entities. 

 **Supported entities** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select \* | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Organization | No | No | No | Yes | No | 
| Ad Account | No | No | No | Yes | No | 
| Creative | No | No | No | Yes | No | 
| Media | No | No | No | Yes | No | 
| Campaign | Yes | No | No | Yes | No | 
| Ad Under Ad Account | Yes | No | No | Yes | No | 
| Ad Under Campaign | No | No | No | Yes | No | 
| Ad Squad | Yes | No | No | Yes | No | 
| Segment | No | No | No | Yes | No | 

 **Example** 

```
snapchatads_read = glueContext.create_dynamic_frame.from_options(
    connection_type="snapchatAds",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "organization",
        "API_VERSION": "v1"
    }
)
```

 **Snapchat Ads entity and field details** 

 Snapchat Ads dynamically loads the available fields under the selected entity. Depending on the data type of a field, the following filter operators are supported. 


| Field Data Type | Supported Filter Operators | 
| --- | --- | 
| Boolean | = | 

 **Partitioning queries** 
+  Field-based partitioning: Not supported. 
+  Record-based partitioning: Not supported. 

# Snapchat Ads connection options


The following are connection options for Snapchat Ads:
+  `ENTITY_NAME` (String) - (Required) Used for Read. The name of the Snapchat Ads entity. Example: `campaign`. 
+  `API_VERSION` (String) - (Required) Used for Read. The Snapchat Ads REST API version you want to use. The value is v1, as Snapchat Ads currently supports only version v1. 
+  `SELECTED_FIELDS` (List<String>) - Default: empty (SELECT \*). Used for Read. Comma-separated list of columns you want to select for the selected entity. 
+  `FILTER_PREDICATE` (String) - Default: empty. Used for Read. Must be in Spark SQL format. 
+  `QUERY` (String) - Default: empty. Used for Read. Full Spark SQL query. 
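As a sketch, a read that uses the `QUERY` option instead of a filter predicate might look like the following. The query text is illustrative, and keeping `ENTITY_NAME` alongside `QUERY` is an assumption.

```python
# Connection options for a Snapchat Ads read using a full Spark SQL query.
connection_options = {
    "connectionName": "connectionName",
    "ENTITY_NAME": "campaign",          # illustrative entity
    "API_VERSION": "v1",                # only v1 is currently supported
    "QUERY": "SELECT * FROM campaign",  # full Spark SQL query, illustrative
}
# frame = glueContext.create_dynamic_frame.from_options(
#     connection_type="snapchatAds", connection_options=connection_options)
```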

# Creating a Snapchat Ad account and configuring the client app

**Topics**
+ [

## Sign up for Snapchat Ads
](#snapchat-ads-sign-up)
+ [

## Steps to create a Snapchat Ad account
](#snapchat-ads-create-ad-account)

## Sign up for Snapchat Ads


**To sign up for Snapchat Ads:**

1.  Navigate to [Snapchat Ads Manager](https://ads.snapchat.com/). Choose **Sign Up** next to **New to Snapchat?**. 

1.  On the **Create Account** screen, follow the prompts to enter your Business Name, Email, Password, etc. Choose **Next**. 

1.  On the **Create Your Profile** screen, enter values for User Name, Website (Optional), and choose **Create Account**. This will give you an option to add a profile photo and bio on the **Edit Your Profile** screen. Choose **Confirm**. 

1.  On the **Business Info** screen, fill out the required fields such as Country, Currency, Phone Number, and GSTIN, and complete the account creation process by choosing **Next**. 

## Steps to create a Snapchat Ad account

**To create a Snapchat Ad account:**

1.  Log in to **Ads Manager**. Then choose the navigation menu in the top corner and select **Ad Accounts**. 

1.  Choose **+ New Ad Account**. Enter your advertiser details: 
   +  Select whether or not you’re an agency buying ads on behalf of an advertiser. 
   +  Select whether or not your ad account will run housing, credit, or employment ads. If you select ‘Yes’, your ad may be rejected if it uses targeting parameters that include age, gender, or postal code level targeting. Minimum age targeting may be applied up to 21 years of age. 
   +  Select whether you will use the ad account for political ads. If you're running a political ad, enter the sponsoring political organization or advocacy group that is paying for the ad. If you do not accurately enter the political organization, your ads may be rejected. You will also need to fill out the mandatory linked 'Political Ad Review Form' before submitting ads. 

1.  Choose **Account Details** and fill out your ad account info:     
<a name="snapchat-ads-account-details"></a>[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/connecting-to-data-snapchat-ads-new-account.html)

1.  Choose **Create Account**. Your ad account will be created, and you can find it in the Ad Accounts portion of Ads Manager. To begin launching ads, you’ll want to input a payment method. You can also add members to your ad account. 

1.  Select whether you’d like to use an existing payment or create a new one. Then, choose **Save Payment Method**. 

1.  Select any [ members you’ve invited](https://businesshelp.snapchat.com/s/article/manage-members?language=en_US) to your business to add to the ad account. For more information about the roles and permissions that can be assigned, see [ Roles and Permissions Overview](https://businesshelp.snapchat.com/s/article/roles-permissions?language=en_US). Members added will then be able to log in to Ads Manager and access this ad account. When you’re done, save your members. 

 For more information about ad accounts, see [Roles and Permissions Overview](https://businesshelp.snapchat.com/s/article/roles-permissions?language=en_US). 

# Creating an app in your Snapchat Ads account

 To activate access to Snapchat’s Marketing API, make sure you have a business account set up. Then follow the steps below. 

1.  Log in to Ads Manager. Then choose the menu in the top left corner and select **Business Dashboard**, then select **Business Details**. 

1.  Choose **+ OAuth App**. 

1.  Enter your app name and add the following URL as the Snap Redirect URI: `https://<aws-region>.console.aws.amazon.com/gluestudio/oauth`. For example, if using the us-west-1 region, the URL would be `https://us-west-1.console.aws.amazon.com/gluestudio/oauth`. Choose **Create OAuth App**. 

1.  Your app credentials (client ID and client secret) will be displayed. Save them, as they will be required to create a connection. 

# Connecting to Snowflake in AWS Glue Studio

**Note**  
 You can use AWS Glue for Spark to read from and write to tables in Snowflake in AWS Glue 4.0 and later versions. To configure a Snowflake connection with AWS Glue jobs programmatically, see [Redshift connections](aws-glue-programming-etl-connect-redshift-home.md). 

 AWS Glue provides built-in support for Snowflake. AWS Glue Studio provides a visual interface to connect to Snowflake, author data integration jobs, and run them on the AWS Glue Studio serverless Spark runtime. 

 AWS Glue Studio creates a unified connection for Snowflake. For more information, see [Considerations](using-connectors-unified-connections.md#using-connectors-unified-connections-considerations). 

**Topics**
+ [

# Creating a Snowflake connection
](creating-snowflake-connection.md)
+ [

# Creating a Snowflake source node
](creating-snowflake-source-node.md)
+ [

# Creating a Snowflake target node
](creating-snowflake-target-node.md)
+ [

# Set up the Authorization Code flow for Snowflake
](snowflake-setup-authorization-code-flow.md)
+ [

## Advanced options
](#creating-snowflake-connection-advanced-options)

# Creating a Snowflake connection


**Note**  
 Unified connections (connection v2) standardize all connections to use the `USERNAME` and `PASSWORD` keys for basic auth credentials. You can still create a v1 connection via the API with secrets containing `sfUser` and `sfPassword`. 

 When adding a **Data source - Snowflake** node in AWS Glue Studio, you can choose an existing AWS Glue Snowflake connection or create a new connection. You must choose a `SNOWFLAKE` type connection, not a `JDBC` type connection configured to connect to Snowflake. Use the following procedure to create an AWS Glue Snowflake connection:

**To create a Snowflake connection**

1. In Snowflake, generate a user, *snowflakeUser*, and password, *snowflakePassword*. 

1. Determine which Snowflake warehouse this user will interact with, *snowflakeWarehouse*. Either set it as the `DEFAULT_WAREHOUSE` for *snowflakeUser* in Snowflake or remember it for the next step.

1. In AWS Secrets Manager, create a secret using your Snowflake credentials. To create a secret in Secrets Manager, follow the tutorial available in [ Create an AWS Secrets Manager secret ](https://docs.aws.amazon.com/secretsmanager/latest/userguide/create_secret.html#create_secret_cli) in the AWS Secrets Manager documentation. After creating the secret, keep the Secret name, *secretName* for the next step. 
   + When selecting **Key/value pairs**, create a pair for *snowflakeUser* with the key `sfUser`.
   + When selecting **Key/value pairs**, create a pair for *snowflakePassword* with the key `sfPassword`.
   + When selecting **Key/value pairs**, create a pair for *snowflakeWarehouse* with the key `sfWarehouse`. This is not needed if a default is set in Snowflake. 

1. In the AWS Glue Data Catalog, create a connection by following the steps in [Adding an AWS Glue connection](https://docs.aws.amazon.com//glue/latest/dg/console-connections.html). After creating the connection, keep the connection name, *connectionName*, for the next step. 
   + When selecting a **Connection type**, select Snowflake.
   + When selecting **Snowflake URL**, provide the hostname of your Snowflake instance. The URL will use a hostname in the form `account_identifier.snowflakecomputing.com`.
   + When selecting an **AWS Secret**, provide *secretName*.
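The secret from step 3 holds the key/value pairs described above; a minimal sketch of the payload, with placeholder values:

```python
import json

# Key/value pairs for the Secrets Manager secret used by a SNOWFLAKE connection.
# sfWarehouse can be omitted if a DEFAULT_WAREHOUSE is set for the user in
# Snowflake. All values below are placeholders.
secret_pairs = {
    "sfUser": "snowflakeUser",
    "sfPassword": "snowflakePassword",
    "sfWarehouse": "snowflakeWarehouse",
}
secret_string = json.dumps(secret_pairs)
print(secret_string)
```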

# Creating a Snowflake source node


## Permissions needed


 AWS Glue Studio jobs using Snowflake data sources require additional permissions. For more information on how to add permissions to ETL jobs, see [Review IAM permissions needed for ETL jobs](https://docs.aws.amazon.com/glue/latest/ug/setting-up.html#getting-started-min-privs-job). 

 `SNOWFLAKE` AWS Glue connections use an AWS Secrets Manager secret to provide credential information. Your job and data preview roles in AWS Glue Studio must have permission to read this secret.

## Adding a Snowflake data source


**Prerequisites**:
+ An AWS Secrets Manager secret for your Snowflake credentials
+ A Snowflake type AWS Glue Data Catalog connection

**To add a Data Source – Snowflake node:**

1.  Choose the connection for your Snowflake data source. This assumes that the connection already exists and you can select from existing connections. If you need to create a connection, choose **Create Snowflake connection**. For more information, see [ Overview of using connectors and connections ](https://docs.aws.amazon.com/glue/latest/ug/connectors-chapter.html#using-connectors-overview). 

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. Information about the connection is visible, including the URL, security groups, subnet, availability zone, description, and created (UTC) and last updated (UTC) timestamps. 

1.  Choose a Snowflake source option: 
   +  **Choose a single table** – access data from a single Snowflake table. 
   +  **Enter custom query** – access a dataset from multiple Snowflake tables based on your custom query. 

1.  If you chose a single table, enter the name of a Snowflake schema. 

    Or, choose **Enter custom query**. Choose this option to access a custom dataset from multiple Snowflake tables. When you choose this option, enter the Snowflake query. 

1.  In **Performance and security** options (optional), 
   +  **Enable query pushdown** – choose if you want to offload work to the Snowflake instance. 

1.  In **Custom Snowflake properties** (optional), enter parameters and values as needed. 

# Creating a Snowflake target node


## Permissions needed


 AWS Glue Studio jobs using Snowflake data sources require additional permissions. For more information on how to add permissions to ETL jobs, see [Review IAM permissions needed for ETL jobs](https://docs.aws.amazon.com/glue/latest/ug/setting-up.html#getting-started-min-privs-job). 

 `SNOWFLAKE` AWS Glue connections use an AWS Secrets Manager secret to provide credential information. Your job and data preview roles in AWS Glue Studio must have permission to read this secret.

## Adding a Snowflake data target


**To create a Snowflake target node:**

1.  Choose an existing Snowflake table as the target, or enter a new table name. 

1.  When you use the **Data target - Snowflake** target node, you can choose from the following options: 
   +  **APPEND** – If a table already exists, dump all the new data into the table as an insert. If the table doesn't exist, create it and then insert all new data. 
   +  **MERGE** – AWS Glue will update or append data to your target table based on the conditions you specify. 

      Choose options: 
     + **Choose keys and simple actions** – choose the columns to be used as matching keys between the source data and your target data set. 

       Specify the following options when matched:
       + Update record in your target data set with data from source.
       + Delete record in your target data set.

       Specify the following options when not matched:
       + Insert source data as a new row into your target data set.
       + Do nothing.
     + **Enter custom MERGE statement** – Enter your own MERGE statement, and then choose **Validate Merge statement** to verify that it is valid.
   +  **TRUNCATE** – If a table already exists, truncate the table data by first clearing the contents of the target table. If truncate is successful, then insert all data. If the table doesn't exist, create the table and insert all data. If truncate is not successful, the operation will fail. 
   +  **DROP** – If a table already exists, delete the table metadata and data. If deletion is successful, then insert all data. If the table doesn't exist, create the table and insert all data. If drop is not successful, the operation will fail. 
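The table-handling behavior of the write actions above can be summarized as a small decision sketch. This is illustrative only; the step names are descriptive labels, not API calls:

```python
# Sketch: how each Snowflake write action handles an existing or missing
# target table, per the descriptions above. Step names are descriptive only.

def write_plan(action, table_exists):
    """Return the ordered steps the write action implies."""
    if action == "APPEND":
        # Create the table only if it doesn't exist, then insert.
        return (["create table"] if not table_exists else []) + ["insert data"]
    if action == "TRUNCATE":
        # Clear the existing table first; the job fails if truncate fails.
        if table_exists:
            return ["truncate table", "insert data"]
        return ["create table", "insert data"]
    if action == "DROP":
        # Delete table metadata and data, then recreate and insert.
        if table_exists:
            return ["drop table", "create table", "insert data"]
        return ["create table", "insert data"]
    if action == "MERGE":
        # Update/delete when matched, insert/skip when not matched.
        return ["merge on matching keys"]
    raise ValueError(f"unknown action: {action}")
```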

# Set up the Authorization Code flow for Snowflake


To use the OAuth authentication method, ensure the following setup is complete:
+ **Configure Snowflake OAuth for a custom client** by following the official Snowflake documentation: [Configure Snowflake OAuth for custom clients](https://docs.snowflake.com/en/user-guide/oauth-custom). 
+ **Set the correct redirect URI** when creating the Snowflake security integration. For example: If you are creating the connection in the DUB (eu-west-1) region, your redirect URI should be: `https://eu-west-1.console.aws.amazon.com/gluestudio/oauth` 
+ After creating the security integration, retain the following information for use when creating the Glue connection: 
  + `OAUTH_CLIENT_ID`: Provide this value as the User Managed Client Application Client ID on the Glue connection creation page.
  + `OAUTH_CLIENT_SECRET`: Store this value in the AWS Secrets Manager secret used for the connection, under the key `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET`.
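As a sketch, the secret value holding the OAuth client secret might be assembled like this. The key name comes from the list above; the secret name in the comment and the placeholder value are hypothetical:

```python
import json

# Sketch: the Secrets Manager secret value expected for the OAuth client
# secret. The placeholder value below is hypothetical; replace it with the
# OAUTH_CLIENT_SECRET value returned by the Snowflake security integration.
secret_value = json.dumps({
    "USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET": "<client secret from Snowflake>"
})

# With boto3 you could then store it, for example (secret name is hypothetical):
# boto3.client("secretsmanager").create_secret(
#     Name="glue/snowflake-oauth", SecretString=secret_value)
```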

## Advanced options


See [Snowflake connections](https://docs.aws.amazon.com//glue/latest/dg/aws-glue-programming-etl-connect-snowflake-home.html) in the AWS Glue Developer Guide. 

# Connecting to Stripe in AWS Glue Studio

 Stripe is an online payment and credit card processing platform for businesses. The Stripe platform lets businesses accept online payments, create subscriptions (recurring billing) for their e-commerce sites, and set up bank accounts to receive payouts. Stripe also supports multi-party payments, which let businesses set up a marketplace, collect payments, and then pay out to sellers or service providers through "Connected" accounts. 

**Topics**
+ [

# AWS Glue support for Stripe
](stripe-support.md)
+ [

# Policies containing the API operations for creating and using connections
](stripe-configuring-iam-permissions.md)
+ [

# Configuring Stripe
](stripe-configuring.md)
+ [

# Configuring Stripe connections
](stripe-configuring-connections.md)
+ [

# Reading from Stripe entities
](stripe-reading-from-entities.md)
+ [

# Stripe connection options
](stripe-connection-options.md)
+ [

# Limitations
](stripe-limitations.md)
+ [

# Creating a new Stripe account and configuring the client app
](stripe-new-account-creation.md)

# AWS Glue support for Stripe


AWS Glue supports Stripe as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Stripe.

**Supported as a target?**  
No.

**Supported Stripe API versions**  
 v1. 

# Policies containing the API operations for creating and using connections

 The following sample policy describes the required IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

------
#### [ JSON ]

****  

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

You can also use the following managed IAM policies to allow access:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, Amazon CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring Stripe


Before you can use AWS Glue to transfer data from Stripe, you must meet these requirements:

## Minimum requirements

+  You must have a Stripe account with an email address and password. For more information, see [Creating a new Stripe account and configuring the client app](stripe-new-account-creation.md). 
+  Your Stripe account must be enabled for API access. Use of the Stripe API is available at no additional cost. 

 If you meet these requirements, you’re ready to connect AWS Glue to your Stripe account. 

# Configuring Stripe connections


 Stripe supports custom authentication. For more information on generating the required API keys for custom authorization, see [STRIPE REST API Documentation](https://docs.stripe.com/api/authentication). 

To configure a Stripe connection:

1.  In AWS Secrets Manager, create a secret with the following details. A separate secret is required for each connection in AWS Glue. 

   1.  For a customer managed connected app – the secret should contain the connected app consumer secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key. 

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps: 

   1. When selecting a **Connection type**, select **Stripe**.

   1.  Select an IAM role that AWS Glue can assume and that has permissions for the following actions: 

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1.  Select the secret (`secretName`) that you want AWS Glue to use to store the tokens for this connection. 

   1.  Select network options if you want the connection to use your own network. 

1.  Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 
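This last step can be sketched as an inline policy granting the job role read access to the connection secret. The role name, policy name, and secret ARN below are hypothetical:

```python
import json

# Sketch: an inline policy allowing the Glue job role to read the
# connection secret. The ARN below is hypothetical; use your secret's ARN.
secret_arn = "arn:aws:secretsmanager:us-east-1:123456789012:secret:my-stripe-secret"

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "secretsmanager:GetSecretValue",
            "secretsmanager:DescribeSecret",
        ],
        "Resource": secret_arn,
    }],
}
policy_document = json.dumps(policy)

# With boto3 you could then attach it (role and policy names are hypothetical):
# boto3.client("iam").put_role_policy(
#     RoleName="MyGlueJobRole", PolicyName="ReadStripeSecret",
#     PolicyDocument=policy_document)
```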

# Reading from Stripe entities


 **Prerequisites** 
+  A Stripe object you would like to read from. 

 **Supported entities** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select \* | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Balance | No | No | No | Yes | No | 
| Balance Transactions | Yes | Yes | No | Yes | Yes | 
| Charges | Yes | Yes | No | Yes | Yes | 
| Disputes | Yes | Yes | No | Yes | Yes | 
| File Links | Yes | Yes | No | Yes | Yes | 
| PaymentIntents | Yes | Yes | No | Yes | Yes | 
| SetupIntents | Yes | Yes | No | Yes | Yes | 
| Payouts | Yes | Yes | No | Yes | Yes | 
| Refunds | Yes | Yes | No | Yes | Yes | 
| Products | Yes | Yes | No | Yes | Yes | 
| Prices | Yes | Yes | No | Yes | Yes | 
| Coupons | Yes | Yes | No | Yes | Yes | 
| Promotion Codes | Yes | Yes | No | Yes | Yes | 
| Tax Codes | No | Yes | No | Yes | No | 
| Tax Rates | Yes | Yes | No | Yes | Yes | 
| Shipping Rates | Yes | Yes | No | Yes | Yes | 
| Sessions | Yes | Yes | No | Yes | Yes | 
| Credit Notes | Yes | Yes | No | Yes | Yes | 
| Customer | Yes | Yes | No | Yes | Yes | 
| Invoices | Yes | Yes | No | Yes | Yes | 
| Invoice Items | Yes | Yes | No | Yes | No | 
| Plans | Yes | Yes | No | Yes | Yes | 
| Quotes | Yes | Yes | No | Yes | No | 
| Subscriptions | Yes | Yes | No | Yes |  | 
| Subscription Items | No | Yes | No | Yes | No | 
| Subscription Schedules | Yes | Yes | No | Yes | Yes | 
| Accounts | No | Yes | No | Yes | Yes | 
| Application Fees | Yes | Yes | No | Yes | Yes | 
| Country Specs | No | Yes | No | Yes | No | 
| Transfers | Yes | Yes | No | Yes | Yes | 
| Early Fraud Warnings | Yes | Yes | No | Yes | Yes | 
| Report Types | No | No | No | Yes | No | 

 **Example** 

```
stripe_read = glueContext.create_dynamic_frame.from_options(
    connection_type="stripe",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "coupons",
        "API_VERSION": "v1"
    }
)
```
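Before filtering an entity, it can help to check the supported operators listed in the field tables that follow. A small sketch of such a check, using an excerpt of the Charges rows (illustrative only; only a few fields are shown):

```python
# Sketch: validate a filter against the supported operators for the
# "Charges" entity. The mapping is an excerpt of the field table below;
# fields without listed operators cannot be filtered.
CHARGES_FILTERABLE = {
    "amount": {"=", "<", ">"},
    "created": {"=", ">=", "<=", "<", ">"},
    "customer": {"="},
    "disputed": {"="},
    "refunded": {"="},
}

def can_filter(field, operator):
    """Return True if the connector accepts this operator for the field."""
    return operator in CHARGES_FILTERABLE.get(field, set())
```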

 **Stripe entity and field details** 


| Entity | Field | Data Type | Supported Operators | 
| --- | --- | --- | --- | 
| Balance |  |  |  | 
|  | available | List |  | 
|  | connect\_reserved | List |  | 
|  | pending | List |  | 
|  | livemode | Boolean |  | 
|  | object | String |  | 
|  | instant\_available | List |  | 
|  | issuing | Struct |  | 
| Balance Transactions |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | amount | Integer |  | 
|  | available\_on | DateTime | =, >=, <=, <, > | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | currency | String |  | 
|  | description | String |  | 
|  | exchange\_rate | BigDecimal |  | 
|  | fee | Integer |  | 
|  | fee\_details | List |  | 
|  | net | Integer |  | 
|  | reporting\_category | String |  | 
|  | source | String | = | 
|  | status | String |  | 
|  | type | String | = | 
|  | cross\_border\_classification | String |  | 
| Charges |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | amount | Integer | =, <, > | 
|  | amount\_captured | Integer |  | 
|  | amount\_refunded | Integer |  | 
|  | application | String |  | 
|  | application\_fee | String |  | 
|  | application\_fee\_amount | Integer |  | 
|  | balance\_transaction | String |  | 
|  | billing\_details | Struct |  | 
|  | calculated\_statement\_descriptor | String |  | 
|  | captured | Boolean |  | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | currency | String |  | 
|  | customer | String | = | 
|  | description | String |  | 
|  | destination | String |  | 
|  | dispute | String |  | 
|  | disputed | Boolean | = | 
|  | failure\_balance\_transaction | String |  | 
|  | failure\_code | String |  | 
|  | failure\_message | String |  | 
|  | fraud\_details | Struct |  | 
|  | invoice | String |  | 
|  | livemode | Boolean |  | 
|  | metadata | Struct |  | 
|  | on\_behalf\_of | String |  | 
|  | order | String |  | 
|  | outcome | Struct |  | 
|  | paid | Boolean |  | 
|  | payment\_intent | String | = | 
|  | payment\_method | String |  | 
|  | payment\_method\_details | Struct |  | 
|  | receipt\_email | String |  | 
|  | receipt\_number | String |  | 
|  | receipt\_url | String |  | 
|  | refunded | Boolean | = | 
|  | refunds | Struct |  | 
|  | review | String |  | 
|  | shipping | Struct |  | 
|  | source | Struct |  | 
|  | source\_transfer | String |  | 
|  | statement\_descriptor | String |  | 
|  | statement\_descriptor\_suffix | String |  | 
|  | status | String |  | 
|  | transfer | String |  | 
|  | transfer\_data | Struct |  | 
|  | transfer\_group | String | = | 
| Disputes |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | amount | Integer | =, <, > | 
|  | balance\_transaction | String |  | 
|  | balance\_transactions | List |  | 
|  | charge | String | = | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | currency | String |  | 
|  | evidence | Struct |  | 
|  | evidence\_details | Struct |  | 
|  | is\_charge\_refundable | Boolean |  | 
|  | livemode | Boolean |  | 
|  | metadata | Struct |  | 
|  | payment\_intent | String | = | 
|  | reason | String | = | 
|  | status | String |  | 
|  | payment\_method\_details | Struct |  | 
| File Links |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | expired | Boolean | = | 
|  | expires\_at | DateTime |  | 
|  | file | String | = | 
|  | livemode | Boolean |  | 
|  | metadata | Struct |  | 
|  | url | String |  | 
| PaymentIntents |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | amount | Integer |  | 
|  | amount\_capturable | Integer |  | 
|  | amount\_details | Struct |  | 
|  | amount\_received | Integer |  | 
|  | application | String |  | 
|  | application\_fee\_amount | Integer |  | 
|  | automatic\_payment\_methods | Struct |  | 
|  | canceled\_at | DateTime |  | 
|  | cancellation\_reason | String |  | 
|  | capture\_method | String |  | 
|  | client\_secret | String |  | 
|  | confirmation\_method | String |  | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | currency | String |  | 
|  | customer | String | = | 
|  | description | String |  | 
|  | invoice | String |  | 
|  | last\_payment\_error | Struct |  | 
|  | latest\_charge | String |  | 
|  | livemode | Boolean |  | 
|  | metadata | Struct |  | 
|  | next\_action | Struct |  | 
|  | on\_behalf\_of | String |  | 
|  | payment\_method | String |  | 
|  | payment\_method\_options | Struct |  | 
|  | payment\_method\_types | List |  | 
|  | payment\_method\_configuration\_details | Struct |  | 
|  | processing | Struct |  | 
|  | receipt\_email | String |  | 
|  | review | String |  | 
|  | setup\_future\_usage | String |  | 
|  | shipping | Struct |  | 
|  | source | String |  | 
|  | statement\_descriptor | String |  | 
|  | statement\_descriptor\_suffix | String |  | 
|  | status | String |  | 
|  | transfer\_data | Struct |  | 
|  | transfer\_group | String |  | 
| SetupIntents |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | application | String |  | 
|  | cancellation\_reason | String |  | 
|  | client\_secret | String |  | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | customer | String | = | 
|  | description | String |  | 
|  | flow\_directions | List |  | 
|  | last\_setup\_error | Struct |  | 
|  | latest\_attempt | String |  | 
|  | livemode | Boolean |  | 
|  | mandate | String |  | 
|  | metadata | Struct |  | 
|  | next\_action | Struct |  | 
|  | on\_behalf\_of | String |  | 
|  | payment\_method | String |  | 
|  | payment\_method\_options | Struct |  | 
|  | payment\_method\_types | List |  | 
|  | single\_use\_mandate | String |  | 
|  | status | String |  | 
|  | usage | String |  | 
|  | automatic\_payment\_methods | Struct |  | 
| Payouts |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | amount | Integer | =, <, > | 
|  | arrival\_date | DateTime | =, >=, <=, <, > | 
|  | automatic | Boolean |  | 
|  | balance\_transaction | String |  | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | currency | String |  | 
|  | description | String | = | 
|  | destination | String |  | 
|  | failure\_balance\_transaction | String |  | 
|  | failure\_code | String |  | 
|  | failure\_message | String |  | 
|  | livemode | Boolean |  | 
|  | metadata | Struct |  | 
|  | method | String |  | 
|  | original\_payout | String |  | 
|  | reversed\_by | String |  | 
|  | reconciliation\_status | String |  | 
|  | source\_type | String |  | 
|  | statement\_descriptor | String |  | 
|  | status | String |  | 
|  | type | String |  | 
|  | application\_fee | String |  | 
|  | application\_fee\_amount | Integer |  | 
| Refunds |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | amount | Integer |  | 
|  | balance\_transaction | String |  | 
|  | charge | String | = | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | currency | String |  | 
|  | metadata | Struct |  | 
|  | destination\_details | Struct |  | 
|  | payment\_intent | String | = | 
|  | reason | String |  | 
|  | receipt\_number | String |  | 
|  | source\_transfer\_reversal | String |  | 
|  | status | String |  | 
|  | transfer\_reversal | String |  | 
| Products |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | active | Boolean | = | 
|  | attributes | List |  | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | default\_price | String |  | 
|  | description | String |  | 
|  | images | List |  | 
|  | livemode | Boolean |  | 
|  | metadata | Struct |  | 
|  | name | String |  | 
|  | package\_dimensions | Struct |  | 
|  | shippable | Boolean |  | 
|  | statement\_descriptor | String |  | 
|  | tax\_code | String |  | 
|  | type | String | = | 
|  | unit\_label | String |  | 
|  | updated | DateTime |  | 
|  | url | String |  | 
|  | features | List |  | 
| Prices |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | active | Boolean | = | 
|  | billing\_scheme | String |  | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | currency | String | = | 
|  | custom\_unit\_amount | Struct |  | 
|  | livemode | Boolean |  | 
|  | lookup\_key | String |  | 
|  | metadata | Struct |  | 
|  | nickname | String |  | 
|  | product | String | = | 
|  | recurring | Struct |  | 
|  | tax\_behavior | String |  | 
|  | tiers\_mode | String |  | 
|  | transform\_quantity | Struct |  | 
|  | type | String | = | 
|  | unit\_amount | Integer |  | 
|  | unit\_amount\_decimal | String |  | 
| Coupons |  |  |  | 
|  | Id | String |  | 
|  | object | String |  | 
|  | amount\_off | Integer |  | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | currency | String | = | 
|  | duration | String | = | 
|  | duration\_in\_months | Integer | =, <, > | 
|  | livemode | Boolean |  | 
|  | max\_redemptions | Integer | =, <, > | 
|  | metadata | Struct |  | 
|  | name | String |  | 
|  | percent\_off | Double | = | 
|  | redeem\_by | DateTime | =, >=, <=, <, > | 
|  | times\_redeemed | Integer |  | 
|  | valid | Boolean |  | 
| Promotion Codes |  |  |  | 
|  | Id | String |  | 
|  | object | String |  | 
|  | active | Boolean | = | 
|  | code | String | = | 
|  | coupon | Struct |  | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | customer | String |  | 
|  | expires\_at | DateTime |  | 
|  | livemode | Boolean |  | 
|  | max\_redemptions | Integer |  | 
|  | metadata | Struct |  | 
|  | restrictions | Struct |  | 
|  | times\_redeemed | Integer |  | 
| Tax Codes |  |  |  | 
|  | Id | String |  | 
|  | object | String |  | 
|  | description | String |  | 
|  | name | String |  | 
| Tax Rates |  |  |  | 
|  | Id | String |  | 
|  | object | String |  | 
|  | active | Boolean | = | 
|  | country | String |  | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | description | String |  | 
|  | display\_name | String |  | 
|  | inclusive | Boolean | = | 
|  | jurisdiction | String |  | 
|  | jurisdiction\_level | String |  | 
|  | livemode | Boolean |  | 
|  | metadata | Struct |  | 
|  | percentage | Double |  | 
|  | effective\_percentage | Double |  | 
|  | state | String |  | 
|  | tax\_type | String |  | 
| Shipping Rates |  |  |  | 
|  | Id | String |  | 
|  | object | String |  | 
|  | active | Boolean | = | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | delivery\_estimate | Struct |  | 
|  | display\_name | String |  | 
|  | fixed\_amount | Struct |  | 
|  | livemode | Boolean |  | 
|  | metadata | Struct |  | 
|  | tax\_behavior | String |  | 
|  | tax\_code | String |  | 
|  | type | String |  | 
| Sessions |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | after\_expiration | Struct |  | 
|  | allow\_promotion\_codes | Boolean |  | 
|  | amount\_subtotal | Integer |  | 
|  | amount\_total | Integer |  | 
|  | automatic\_tax | Struct |  | 
|  | billing\_address\_collection | String |  | 
|  | cancel\_url | String |  | 
|  | client\_reference\_id | String |  | 
|  | consent | Struct |  | 
|  | consent\_collection | Struct |  | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | currency | String |  | 
|  | custom\_text | Struct |  | 
|  | customer | String |  | 
|  | customer\_creation | String |  | 
|  | customer\_details | Struct |  | 
|  | customer\_email | String |  | 
|  | expires\_at | DateTime |  | 
|  | invoice | String |  | 
|  | invoice\_creation | Struct |  | 
|  | livemode | Boolean |  | 
|  | locale | String |  | 
|  | metadata | Struct |  | 
|  | mode | String |  | 
|  | payment\_intent | String | = | 
|  | payment\_link | String |  | 
|  | payment\_method\_collection | String |  | 
|  | payment\_method\_options | Struct |  | 
|  | payment\_method\_types | List |  | 
|  | payment\_status | String |  | 
|  | phone\_number\_collection | Struct |  | 
|  | recovered\_from | String |  | 
|  | setup\_intent | String |  | 
|  | shipping\_address\_collection | Struct |  | 
|  | shipping\_cost | Struct |  | 
|  | shipping\_details | Struct |  | 
|  | shipping\_options | List |  | 
|  | status | String |  | 
|  | submit\_type | String |  | 
|  | subscription | String |  | 
|  | success\_url | String |  | 
|  | tax\_id\_collection | Struct |  | 
|  | total\_details | Struct |  | 
|  | url | String |  | 
|  | ui\_mode | String |  | 
| Credit Notes |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | amount | Integer |  | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | currency | String |  | 
|  | customer | String | = | 
|  | customer\_balance\_transaction | String |  | 
|  | discount\_amount | Integer |  | 
|  | discount\_amounts | List |  | 
|  | invoice | String | = | 
|  | lines | Struct |  | 
|  | livemode | Boolean |  | 
|  | memo | String |  | 
|  | metadata | Struct |  | 
|  | number | String |  | 
|  | out\_of\_band\_amount | Integer |  | 
|  | pdf | String |  | 
|  | reason | String |  | 
|  | refund | String |  | 
|  | status | String |  | 
|  | subtotal | Integer |  | 
|  | subtotal\_excluding\_tax | Integer |  | 
|  | tax\_amounts | List |  | 
|  | total | Integer |  | 
|  | total\_excluding\_tax | Integer |  | 
|  | type | String |  | 
|  | voided\_at | DateTime |  | 
|  | amount\_shipping | Integer |  | 
|  | effective\_at | DateTime |  | 
|  | shipping\_cost | Struct |  | 
| Customer |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | address | Struct |  | 
|  | balance | Integer |  | 
|  | created | DateTime |  | 
|  | currency | String | =, >=, <=, <, > | 
|  | default\_source | String |  | 
|  | delinquent | Boolean | = | 
|  | description | String |  | 
|  | discount | Struct |  | 
|  | email | String | = | 
|  | invoice\_prefix | String |  | 
|  | invoice\_settings | Struct |  | 
|  | livemode | Boolean |  | 
|  | metadata | Struct |  | 
|  | name | String |  | 
|  | next\_invoice\_sequence | Integer |  | 
|  | phone | String |  | 
|  | preferred\_locales | List |  | 
|  | shipping | Struct |  | 
|  | tax\_exempt | String |  | 
|  | test\_clock | String |  | 
| Invoices |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | account\_country | String |  | 
|  | account\_name | String |  | 
|  | account\_tax\_ids | List |  | 
|  | amount\_due | Integer |  | 
|  | amount\_paid | Integer |  | 
|  | amount\_remaining | Integer |  | 
|  | application | String |  | 
|  | application\_fee\_amount | Integer |  | 
|  | attempt\_count | Integer |  | 
|  | attempted | Boolean | = | 
|  | auto\_advance | Boolean | = | 
|  | automatic\_tax | Struct |  | 
|  | billing\_reason | String |  | 
|  | charge | String |  | 
|  | collection\_method | String | = | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | currency | String |  | 
|  | custom\_fields | List |  | 
|  | customer | String | = | 
|  | customer\_address | Struct |  | 
|  | customer\_email | String |  | 
|  | customer\_name | String |  | 
|  | customer\_phone | String |  | 
|  | customer\_shipping | Struct |  | 
|  | customer\_tax\_exempt | String |  | 
|  | customer\_tax\_ids | List |  | 
|  | default\_payment\_method | String |  | 
|  | default\_source | String |  | 
|  | default\_tax\_rates | List |  | 
|  | description | String |  | 
|  | discount | Struct |  | 
|  | discounts | List |  | 
|  | due\_date | DateTime | =, >=, <=, <, > | 
|  | ending\_balance | Integer |  | 
|  | footer | String |  | 
|  | from\_invoice | Struct |  | 
|  | hosted\_invoice\_url | String |  | 
|  | invoice\_pdf | String |  | 
|  | last\_finalization\_error | Struct |  | 
|  | latest\_revision | String |  | 
|  | lines | Struct |  | 
|  | livemode | Boolean |  | 
|  | metadata | Struct |  | 
|  | next\_payment\_attempt | DateTime |  | 
|  | number | String |  | 
|  | on\_behalf\_of | String |  | 
|  | paid | Boolean | = | 
|  | paid\_out\_of\_band | Boolean |  | 
|  | payment\_intent | String |  | 
|  | payment\_settings | Struct |  | 
|  | period\_end | DateTime | =, >=, <=, <, > | 
|  | period\_start | DateTime | =, >=, <=, <, > | 
|  | post\_payment\_credit\_notes\_amount | Integer |  | 
|  | pre\_payment\_credit\_notes\_amount | Integer |  | 
|  | quote | String |  | 
|  | receipt\_number | String |  | 
|  | rendering | Struct |  | 
|  | rendering\_options | Struct |  | 
|  | starting\_balance | Integer |  | 
|  | statement\_descriptor | String |  | 
|  | status | String | = | 
|  | status\_transitions | Struct |  | 
|  | subscription | String |  | 
|  | subscription\_details | Struct |  | 
|  | subtotal | Integer | =, <, > | 
|  | subtotal\_excluding\_tax | Integer |  | 
|  | tax | Integer |  | 
|  | test\_clock | String |  | 
|  | total | Integer | =, <, > | 
|  | total\_discount\_amounts | List |  | 
|  | total\_excluding\_tax | Integer |  | 
|  | total\_tax\_amounts | List |  | 
|  | transfer\_data | Struct |  | 
|  | webhooks\_delivered\_at | DateTime |  | 
|  | automatically\_finalizes\_at | DateTime |  | 
|  | effective\_at | DateTime |  | 
|  | issuer | Struct |  | 
| Invoice Items |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | amount | Integer | =, <, > | 
|  | currency | String |  | 
|  | customer | String | = | 
|  | date | DateTime |  | 
|  | description | String |  | 
|  | discountable | Boolean |  | 
|  | discounts | List |  | 
|  | invoice | String | = | 
|  | livemode | Boolean |  | 
|  | metadata | Struct |  | 
|  | period | Struct |  | 
|  | plan | Struct |  | 
|  | price | Struct |  | 
|  | proration | Boolean | = | 
|  | quantity | Integer |  | 
|  | subscription | String |  | 
|  | subscription\_item | String |  | 
|  | tax\_rates | List |  | 
|  | test\_clock | String |  | 
|  | unit\_amount | Integer |  | 
|  | unit\_amount\_decimal | String |  | 
| Plans |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | active | Boolean | = | 
|  | aggregate\_usage | String |  | 
|  | amount | Integer |  | 
|  | amount\_decimal | String |  | 
|  | billing\_scheme | String |  | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | currency | String | = | 
|  | interval | String | = | 
|  | interval\_count | Integer |  | 
|  | livemode | Boolean |  | 
|  | metadata | Struct |  | 
|  | nickname | String |  | 
|  | product | String | = | 
|  | tiers\_mode | String |  | 
|  | transform\_usage | Struct |  | 
|  | trial\_period\_days | Integer | =, <, > | 
|  | usage\_type | String |  | 
|  | meter | String |  | 
| Quotes |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | amount\_subtotal | Integer |  | 
|  | amount\_total | Integer |  | 
|  | application | String |  | 
|  | application\_fee\_amount | Integer |  | 
|  | application\_fee\_percent | Double |  | 
|  | automatic\_tax | Struct |  | 
|  | collection\_method | String |  | 
|  | computed | Struct |  | 
|  | created | DateTime |  | 
|  | currency | String |  | 
|  | customer | String | = | 
|  | default\_tax\_rates | List |  | 
|  | description | String |  | 
|  | discounts | List |  | 
|  | expires\_at | DateTime |  | 
|  | footer | String |  | 
|  | from\_quote | Struct |  | 
|  | header | String |  | 
|  | invoice | String |  | 
|  | invoice\_settings | Struct |  | 
|  | livemode | Boolean |  | 
|  | metadata | Struct |  | 
|  | number | String |  | 
|  | on\_behalf\_of | String |  | 
|  | status | String | = | 
|  | status\_transitions | Struct |  | 
|  | subscription | String |  | 
|  | subscription\_data | Struct |  | 
|  | subscription\_schedule | String |  | 
|  | test\_clock | String |  | 
|  | total\_details | Struct |  | 
|  | transfer\_data | Struct |  | 
| Subscriptions |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | application | String |  | 
|  | application\_fee\_percent | Double |  | 
|  | automatic\_tax | Struct |  | 
|  | billing\_cycle\_anchor | DateTime |  | 
|  | billing\_thresholds | Struct |  | 
|  | cancel\_at | DateTime |  | 
|  | cancel\_at\_period\_end | Boolean |  | 
|  | canceled\_at | DateTime |  | 
|  | collection\_method | String | = | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | currency | String |  | 
|  | current\_period\_end | DateTime | =, >=, <= | 
|  | current\_period\_start | DateTime | =, >=, <= | 
|  | customer | String | = | 
|  | days\_until\_due | Integer |  | 
|  | default\_payment\_method | String |  | 
|  | default\_source | String |  | 
|  | default\_tax\_rates | List |  | 
|  | description | String |  | 
|  | discount | Struct |  | 
|  | ended\_at | DateTime |  | 
|  | items | Struct |  | 
|  | latest\_invoice | String |  | 
|  | livemode | Boolean |  | 
|  | metadata | Struct |  | 
|  | next\_pending\_invoice\_item\_invoice | DateTime |  | 
|  | pause\_collection | Struct |  | 
|  | payment\_settings | Struct |  | 
|  | pending\_invoice\_item\_interval | Struct |  | 
|  | pending\_setup\_intent | String |  | 
|  | pending\_update | Struct |  | 
|  | plan | Struct |  | 
|  | quantity | Integer |  | 
|  | schedule | String |  | 
|  | start\_date | DateTime |  | 
|  | status | String | = | 
|  | test\_clock | String |  | 
|  | transfer\_data | Struct |  | 
|  | trial\_end | DateTime |  | 
|  | trial\_start | DateTime |  | 
| Subscription Items |  |  |  | 
|  | Id | String |  | 
|  | object | String |  | 
|  | billing\_thresholds | Struct |  | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | metadata | Struct |  | 
|  | plan | Struct |  | 
|  | price | Struct |  | 
|  | subscription | String |  | 
|  | tax\_rates | List |  | 
|  | discounts | List |  | 
| Subscription Schedules |  |  |  | 
|  | object | String |  | 
|  | application | String |  | 
|  | canceled\$1at | DateTime |  | 
|  | completed\$1at | DateTime |  | 
|  | created | DateTime |  | 
|  | current\$1phase | Struct |  | 
|  | customer | String | = | 
|  | default\$1settings | Struct |  | 
|  | end\$1behavior | String |  | 
|  | livemode | Boolean |  | 
|  | metadata | Struct |  | 
|  | phases | List |  | 
|  | released\$1at | DateTime |  | 
|  | released\$1subscription | String |  | 
|  | renewal\$1interval | String |  | 
|  | status | String |  | 
|  | subscription | String |  | 
|  | test\$1clock | String |  | 
| Accounts |  |  |  | 
|  | details\_submitted | Boolean |  | 
|  | tos\_acceptance | Struct |  | 
|  | type | String |  | 
|  | metadata | Struct |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | default\_currency | String |  | 
|  | capabilities | Struct |  | 
|  | charges\_enabled | Boolean |  | 
|  | settings | Struct |  | 
|  | requirements | Struct |  | 
|  | payouts\_enabled | Boolean |  | 
|  | future\_requirements | Struct |  | 
|  | external\_accounts | Struct |  | 
|  | controller | Struct |  | 
|  | country | String |  | 
|  | email | String |  | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | business\_profile | Struct |  | 
|  | business\_type | String |  | 
|  | company | Struct |  | 
| Application Fees |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | account | String |  | 
|  | amount | Integer | =, <, > | 
|  | amount\_refunded | Integer | =, <, > | 
|  | application | String |  | 
|  | balance\_transaction | String |  | 
|  | charge | String | = | 
|  | created | DateTime |  | 
|  | currency | String |  | 
|  | livemode | Boolean |  | 
|  | originating\_transaction | String |  | 
|  | refunded | Boolean | = | 
|  | refunds | Struct |  | 
|  | fee\_source | Struct |  | 
| Country Specs |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | default\_currency | String |  | 
|  | supported\_bank\_account\_currencies | Struct |  | 
|  | supported\_payment\_currencies | List |  | 
|  | supported\_payment\_methods | List |  | 
|  | supported\_transfer\_countries | List |  | 
|  | verification\_fields | Struct |  | 
| Transfers |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | amount | Integer | =, <, > | 
|  | amount\_reversed | Integer |  | 
|  | balance\_transaction | String |  | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | currency | String | = | 
|  | description | String |  | 
|  | destination | String | = | 
|  | destination\_payment | String |  | 
|  | livemode | Boolean |  | 
|  | metadata | Struct |  | 
|  | reversals | Struct |  | 
|  | reversed | Boolean |  | 
|  | source\_transaction | String |  | 
|  | source\_type | String |  | 
|  | transfer\_group | String | = | 
| Early Fraud Warnings |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | actionable | Boolean |  | 
|  | charge | String | = | 
|  | created | DateTime | =, >=, <=, <, > | 
|  | fraud\_type | String |  | 
|  | livemode | Boolean |  | 
|  | payment\_intent | String | = | 
| Report Types |  |  |  | 
|  | id | String |  | 
|  | object | String |  | 
|  | data\_available\_end | DateTime |  | 
|  | data\_available\_start | DateTime |  | 
|  | default\$1columns | List |  | 
|  | livemode | Boolean |  | 
|  | name | String |  | 
|  | updated | DateTime |  | 
|  | version | Integer |  | 

 **Partitioning queries** 

 You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently. 
+  `PARTITION_FIELD`: the name of the field to be used to partition the query. 
+  `LOWER_BOUND`: an inclusive lower bound value of the chosen partition field. 

   For date fields, we accept the Spark date format used in Spark SQL queries. Example of a valid value: `"2024-07-01T00:00:00.000Z"`. 
+  `UPPER_BOUND`: an exclusive upper bound value of the chosen partition field. 
+  `NUM_PARTITIONS`: number of partitions. 
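Conceptually, the bounded range is divided evenly into `NUM_PARTITIONS` sub-ranges, each inclusive of its lower bound and exclusive of its upper bound. The following is a minimal sketch of that split, for illustration only; it is not the connector's actual implementation:

```
from datetime import datetime

# Split [lower, upper) into num_partitions equal sub-ranges.
# Each sub-range is inclusive of its lower bound, exclusive of its upper bound.
def partition_bounds(lower: str, upper: str, num_partitions: int):
    fmt = "%Y-%m-%dT%H:%M:%S.%f%z"
    lo = datetime.strptime(lower.replace("Z", "+0000"), fmt)
    hi = datetime.strptime(upper.replace("Z", "+0000"), fmt)
    step = (hi - lo) / num_partitions
    return [(lo + i * step, lo + (i + 1) * step) for i in range(num_partitions)]
```

For example, with a `LOWER_BOUND` of 2024-07-01, an `UPPER_BOUND` of 2024-07-11, and 10 partitions, each sub-query covers one day.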

 The following table lists the partitioning field supported for each entity. 


| Entity Name | Partitioning Field | Data Type | 
| --- | --- | --- | 
| Balance Transactions | created | DateTime | 
| Charges | created | DateTime | 
| Disputes | created | DateTime | 
| File Links | created | DateTime | 
| PaymentIntents | created | DateTime | 
| SetupIntents | created | DateTime | 
| Payouts | created | DateTime | 
| Refunds | created | DateTime | 
| Products | created | DateTime | 
| Prices | created | DateTime | 
| Coupons | created | DateTime | 
| Promotion Codes | created | DateTime | 
| Tax Rates | created | DateTime | 
| Shipping Rates | created | DateTime | 
| Sessions | created | DateTime | 
| Credit Notes | created | DateTime | 
| Customer | created | DateTime | 
| Invoices | created | DateTime | 
| Plans | created | DateTime | 
| Subscriptions | created | DateTime | 
| Subscription Schedules | created | DateTime | 
| Accounts | created | DateTime | 
| Application Fees | created | DateTime | 
| Transfers | created | DateTime | 
| Early Fraud Warnings | created | DateTime | 

 **Example** 

```
stripe_read = glueContext.create_dynamic_frame.from_options(
    connection_type="stripe",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "coupons",
        "API_VERSION": "v1",
        "PARTITION_FIELD": "created"
        "LOWER_BOUND": "2024-05-01T20:55:02.000Z"
        "UPPER_BOUND": "2024-07-11T20:55:02.000Z"
        "NUM_PARTITIONS": "10"
    }
)
```

# Stripe connection options


The following are connection options for Stripe:
+  `ENTITY_NAME`(String) - (Required) Used for Read/Write. The name of your Object in Stripe. 
+  `API_VERSION`(String) - (Required) Used for Read/Write. Stripe Rest API version you want to use. Example: v1. 
+  `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT \*). Used for Read. Columns you want to select for the object. 
+  `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format. 
+  `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query. 
+  `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition query. 
+  `LOWER_BOUND`(String)- Used for Read. An inclusive lower bound value of the chosen partition field. 
+  `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field. 
+  `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read. 
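As a sketch of how these options fit together, the helper below (hypothetical, not part of the AWS Glue API) assembles a `connection_options` dictionary that could be passed to `glueContext.create_dynamic_frame.from_options`, as in the read example above:

```
# Hypothetical helper that builds the connection_options dictionary for a
# Stripe read; the option names follow the list above.
def stripe_connection_options(entity_name, selected_fields=None, filter_predicate=None):
    options = {
        "connectionName": "connectionName",  # your AWS Glue connection name
        "ENTITY_NAME": entity_name,
        "API_VERSION": "v1",
    }
    if selected_fields:
        options["SELECTED_FIELDS"] = selected_fields
    if filter_predicate:
        options["FILTER_PREDICATE"] = filter_predicate
    return options

# The result would then be used as:
# glueContext.create_dynamic_frame.from_options(
#     connection_type="stripe",
#     connection_options=stripe_connection_options("coupons"))
```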

# Limitations


The following are limitations for the Stripe connector:
+  The connector supports only field-based partitioning. 
+  Record-based partitioning is not supported, because there is no provision to retrieve the total count of records. 
+  The primary key datatype is String, so the connector does not support ID-based partitioning. 

# Creating a new Stripe account and configuring the client app


**Creating a Stripe account**

1. Go to **https://dashboard.stripe.com/register**.

1. Enter your email, full name, and password, then choose **Create Account**.

1. Verify the account by clicking the verification link sent to your email address.

1. Choose **Activate payments** to activate the account. You are redirected to the Activate payments page (**https://dashboard.stripe.com/welcome**). Fill in all of your valid details, then choose **Continue**.



**Retrieving your Stripe API keys**

1. Log in to [Stripe](https://dashboard.stripe.com/login).

1. Choose **Developers** at the top of the dashboard.

1. Choose **API keys** under Developers.

1. Choose **Reveal test key** to get the API keys.

# Connecting to Teradata Vantage in AWS Glue Studio

 AWS Glue provides built-in support for Teradata Vantage. AWS Glue Studio provides a visual interface to connect to Teradata, author data integration jobs, and run them on the AWS Glue Studio serverless Spark runtime. 

 AWS Glue Studio creates a unified connection for Teradata Vantage. For more information, see [Considerations](using-connectors-unified-connections.md#using-connectors-unified-connections-considerations). 

**Topics**
+ [

# Creating a Teradata Vantage connection
](creating-teradata-connection.md)
+ [

# Creating a Teradata source node
](creating-teradata-source-node.md)
+ [

# Creating a Teradata target node
](creating-teradata-target-node.md)
+ [

## Advanced options
](#creating-teradata-connection-advanced-options)

# Creating a Teradata Vantage connection


To connect to Teradata Vantage from AWS Glue, create and store your Teradata credentials in an AWS Secrets Manager secret, then associate that secret with an AWS Glue Teradata connection.

**Prerequisites**:
+ If you are accessing your Teradata environment through Amazon VPC, configure Amazon VPC to allow your AWS Glue job to communicate with the Teradata environment. We discourage accessing the Teradata environment over the public internet.

  In Amazon VPC, identify or create a **VPC**, **Subnet** and **Security group** that AWS Glue will use while executing the job. Additionally, you need to ensure Amazon VPC is configured to permit network traffic between your Teradata instance and this location. Your job will need to establish a TCP connection with your Teradata client port. For more information about Teradata ports, see the [Teradata documentation](https://docs.teradata.com/r/Teradata-VantageTM-on-AWS-DIY-Installation-and-Administration-Guide/April-2020/Before-Deploying-Vantage-on-AWS-DIY/Security-Groups-and-Ports).

  Based on your network layout, secure VPC connectivity may require changes in Amazon VPC and other networking services. For more information about AWS connectivity, consult [AWS Connectivity Options](https://docs.teradata.com/r/Teradata-VantageCloud-Enterprise/Get-Started/Connecting-Your-Environment/AWS-Connectivity-Options) in the Teradata documentation.

**To configure an AWS Glue Teradata connection:**

1. In your Teradata configuration, identify or create a user and password AWS Glue will connect with, *teradataUser* and *teradataPassword*. For more information, consult [Vantage Security Overview](https://docs.teradata.com/r/Configuring-Teradata-VantageTM-After-Installation/January-2021/Security-Overview/Vantage-Security-Overview) in the Teradata documentation.

1. In AWS Secrets Manager, create a secret using your Teradata credentials. To create a secret in Secrets Manager, follow the tutorial available in [ Create an AWS Secrets Manager secret ](https://docs.aws.amazon.com//secretsmanager/latest/userguide/create_secret.html) in the AWS Secrets Manager documentation. After creating the secret, keep the Secret name, *secretName* for the next step. 
   + When selecting **Key/value pairs**, create a pair for the key `user` with the value *teradataUser*.
   + When selecting **Key/value pairs**, create a pair for the key `password` with the value *teradataPassword*.

1. In the AWS Glue console, create a connection by following the steps in [Adding an AWS Glue connection](console-connections.md). After creating the connection, keep the connection name, *connectionName*, for the next step. 
   + When selecting a **Connection type**, select Teradata.
   + When providing **JDBC URL**, provide the URL for your instance. You can also hardcode certain comma separated connection parameters in your JDBC URL. The URL must conform to the following format: `jdbc:teradata://teradataHostname/ParameterName=ParameterValue,ParameterName=ParameterValue`

     Supported URL parameters include:
     + `DATABASE` – the name of the database on the host to access by default.
     + `DBS_PORT` – the database port, used when running on a nonstandard port.
   + When selecting a **Credential type**, select **AWS Secrets Manager**, then set **AWS Secret** to *secretName*.

1. In the following situations, you may require additional configuration:
   + 

     For Teradata instances hosted on AWS in an Amazon VPC
     + You will need to provide Amazon VPC connection information to the AWS Glue connection that defines your Teradata security credentials. When creating or updating your connection, set **VPC**, **Subnet** and **Security groups** in **Network options**.
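The JDBC URL format from step 3 can be illustrated with a small helper (hypothetical, shown only to make the comma-separated `ParameterName=ParameterValue` form concrete):

```
# Build a Teradata JDBC URL in the form
# jdbc:teradata://host/NAME=VALUE,NAME=VALUE described above.
def teradata_jdbc_url(hostname: str, **params: str) -> str:
    suffix = ",".join(f"{k}={v}" for k, v in params.items())
    return f"jdbc:teradata://{hostname}" + (f"/{suffix}" if suffix else "")
```

For example, `teradata_jdbc_url("teradataHostname", DATABASE="sales", DBS_PORT="1026")` yields `jdbc:teradata://teradataHostname/DATABASE=sales,DBS_PORT=1026`.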

# Creating a Teradata source node


## Prerequisites needed

+ An AWS Glue Teradata Vantage connection, configured with an AWS Secrets Manager secret, as described in the previous section, [Creating a Teradata Vantage connection](creating-teradata-connection.md).
+ Appropriate permissions on your job to read the secret used by the connection.
+ A Teradata table you would like to read from, *tableName*, or query *targetQuery*.

## Adding a Teradata data source


**To add a Data source – Teradata node:**

1.  Choose the connection for your Teradata data source. Since you have created it, it should be available in the dropdown. If you need to create a connection, choose **Create a new connection**. For more information see the previous section, [Creating a Teradata Vantage connection](creating-teradata-connection.md). 

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1.  Choose a **Teradata Source** option: 
   +  **Choose a single table** – access all data from a single table. 
   +  **Enter custom query** – access a dataset from multiple tables based on your custom query. 

1.  If you chose a single table, enter *tableName*. 

    If you chose **Enter custom query**, enter a SQL SELECT query. 

1.  In **Custom Teradata properties**, enter parameters and values as needed. 

# Creating a Teradata target node


## Prerequisites needed

+ An AWS Glue Teradata Vantage connection, configured with an AWS Secrets Manager secret, as described in the previous section, [Creating a Teradata Vantage connection](creating-teradata-connection.md).
+ Appropriate permissions on your job to read the secret used by the connection.
+ A Teradata table you would like to write to, *tableName*.

## Adding a Teradata data target


**To add a Data target – Teradata node:**

1.  Choose the connection for your Teradata data target. Since you have created it, it should be available in the dropdown. If you need to create a connection, choose **Create Teradata connection**. For more information, see [ Overview of using connectors and connections ](https://docs.aws.amazon.com/glue/latest/ug/connectors-chapter.html#using-connectors-overview). 

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1. Configure **Table name** by providing *tableName*.

1.  In **Custom Teradata properties**, enter parameters and values as needed. 

## Advanced options


You can provide advanced options when creating a Teradata node. These options are the same as those available when programming AWS Glue for Spark scripts.

See [Teradata Vantage connections](aws-glue-programming-etl-connect-teradata-home.md). 

# Connecting to Twilio

Twilio provides programmable communication tools for making and receiving phone calls, sending and receiving text messages, and performing other communication functions through its web service APIs. Twilio's APIs power its communications platform. Behind these APIs is a software layer connecting and optimizing communications networks around the world to allow your users to call and message anyone, globally. As a Twilio user, you can connect AWS Glue to your Twilio account. Then, you can use Twilio as a data source in your ETL jobs. Run these jobs to transfer data between Twilio and AWS services or other supported applications.

**Topics**
+ [

# AWS Glue support for Twilio
](twilio-support.md)
+ [

# Policies containing the API operations for creating and using connections
](twilio-configuring-iam-permissions.md)
+ [

# Configuring Twilio
](twilio-configuring.md)
+ [

# Configuring Twilio connections
](twilio-configuring-connections.md)
+ [

# Reading from Twilio entities
](twilio-reading-from-entities.md)
+ [

# Twilio connection options
](twilio-connection-options.md)
+ [

# Limitations and notes for Twilio connector
](twilio-connector-limitations.md)

# AWS Glue support for Twilio


AWS Glue supports Twilio as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Twilio.

**Supported as a target?**  
No.

**Supported Twilio API versions**  
The following Twilio API versions are supported:
+ v1
+ 2010-04-01

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Twilio


Before you can use AWS Glue to transfer data from Twilio, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a Twilio account with username and password.
+ Your Twilio account is enabled for API access.

If you meet these requirements, you're ready to connect AWS Glue to your Twilio account. For typical connections, you don't need to do anything else in Twilio.

# Configuring Twilio connections


Twilio supports username and password for Basic Authentication. Basic Authentication is a simple authentication method where clients provide credentials directly to access protected resources. AWS Glue is able to use the username (Account SID) and password (Auth Token) to authenticate Twilio APIs.

For the public Twilio documentation of the Basic Authentication flow, see [Basic Authentication – Twilio](https://www.twilio.com/docs/glossary/what-is-basic-authentication).
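For reference, Basic Authentication amounts to sending a base64-encoded `AccountSid:AuthToken` pair in the HTTP `Authorization` header. The sketch below is illustrative only; AWS Glue builds this header for you once the secret is configured:

```
import base64

# Illustrative only: shows what a Basic Authentication header looks like.
# AWS Glue performs this step for you using the credentials in your secret.
def basic_auth_header(account_sid: str, auth_token: str) -> str:
    credentials = f"{account_sid}:{auth_token}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")
```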

To configure a Twilio connection:

1. In AWS Secrets Manager, create a secret with the following details:
   + For Basic Authentication: the secret should contain the **Account SID** (Username) and **Auth Token** (Password).
**Note**  
You must create a secret for your connections in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select Twilio.

   1. Provide the [Edge Location](https://www.twilio.com/docs/global-infrastructure/edge-locations) of the Twilio instance you want to connect to.

   1. Select the AWS IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version":"2012-10-17",		 	 	 
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want to use for this connection; AWS Glue stores the tokens in this secret.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

# Reading from Twilio entities


**Prerequisite**

A Twilio object you would like to read from. You will need the object name such as `SMS-Message` or `SMS-CountryPricing`.

**Supported entities for source**:


| Entity | Interface | Can be filtered | Supports limit | Supports Order by | Supports Select \* | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | --- | 
| SMS-Message | REST | Yes | Yes | No | Yes | Yes | 
| SMS-CountryPricing | REST | No | No | No | Yes | No | 
| Voice-Call | REST | Yes | Yes | No | Yes | No | 
| Voice-Application | REST | Yes | Yes | No | Yes | No | 
| Voice-OutgoingCallerID | REST | Yes | Yes | No | Yes | No | 
| Voice-Queue | REST | Yes | Yes | No | Yes | No | 
| Conversations-Conversation | REST | Yes | Yes | No | Yes | No | 
| Conversations-User | REST | No | Yes | No | Yes | No | 
| Conversations-Role | REST | No | Yes | No | Yes | No | 
| Conversations-Configuration | REST | No | No | No | Yes | No | 
| Conversations-AddressConfiguration | REST | Yes | Yes | No | Yes | No | 
| Conversations-WebhookConfiguration | REST | No | No | No | Yes | No | 
| Conversations-ParticipantConversation | REST | No | No | No | Yes | No | 
| Conversations-Credential | REST | No | Yes | No | Yes | No | 
| Conversations-ConversationService | REST | No | Yes | No | Yes | No | 

**Example**:

```
twilio_read = glueContext.create_dynamic_frame.from_options(
    connection_type="twilio",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "sms-message",
        "API_VERSION": "2010-04-01",
        "Edge_Location": "sydney.us1"
    }
)
```

**Twilio entity and field details**:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/twilio-reading-from-entities.html)

## Partitioning queries


**Fields supporting partitioning**:

In Twilio, the DateTime datatype fields support field-based partitioning.

You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query would be split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For the Datetime field, we accept the Spark timestamp format used in Spark SQL queries.

  Examples of valid value:

  ```
  "2024-05-01T20:55:02.000Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.

Example:

```
twilio_read = glueContext.create_dynamic_frame.from_options(
    connection_type="twilio",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "sms-message",
        "API_VERSION": "2010-04-01",
        "PARTITION_FIELD": "date_sent"
        "LOWER_BOUND": "2024-05-01T20:55:02.000Z"
        "UPPER_BOUND": "2024-06-01T20:55:02.000Z"
        "NUM_PARTITIONS": "10"
    }
```

# Twilio connection options


The following are connection options for Twilio:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in Twilio.
+ `EDGE_LOCATION`(String) - (Required) A valid Twilio edge location.
+ `API_VERSION`(String) - (Required) Used for Read. Twilio Rest API version you want to use. Twilio supports two API versions: ‘v1’ and ‘2010-04-01’.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT \*). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition query.
+ `LOWER_BOUND`(String)- Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for Read.
+ `INSTANCE_URL`(String) - (Required) Used for Read. A valid Twilio instance URL.

# Limitations and notes for Twilio connector


The following are limitations or notes for the Twilio connector:
+ Record-based partitioning is not supported, as there is no provision to retrieve the total count of records from Twilio.
+ The fields `date_sent`, `start_time`, and `end_time` are of the Datetime datatype, but when filtering, they only support date values (time components are not considered).
+ Filtering the "from" or "to" fields works only if the values do not include any prefix (for example, a protocol or label). If a prefix is present, filtering for the respective field does not work. For example if you pass "to": "whatsapp:\$114xxxxxxxxxx" as a filter then Twilio won't return a response. You need to pass it as "to": "\$114xxxxxxxx", then it will return records if they exist.
+ The "identity" field filter is mandatory when querying the `conversation-participant-conversation` entity.

# Connecting to Vertica in AWS Glue Studio

 AWS Glue provides built-in support for Vertica. AWS Glue Studio provides a visual interface to connect to Vertica, author data integration jobs, and run them on the AWS Glue Studio serverless Spark runtime. 

 AWS Glue Studio creates a unified connection for Vertica. For more information, see [Considerations](using-connectors-unified-connections.md#using-connectors-unified-connections-considerations). 

**Topics**
+ [

# Creating a Vertica connection
](creating-vertica-connection.md)
+ [

# Creating a Vertica source node
](creating-vertica-source-node.md)
+ [

# Creating a Vertica target node
](creating-vertica-target-node.md)
+ [

## Advanced options
](#creating-vertica-connection-advanced-options)

# Creating a Vertica connection


**Prerequisites**:
+ An Amazon S3 bucket or folder to use for temporary storage when reading from and writing to the database, referred to by *tempS3Path*.
**Note**  
When using Vertica in AWS Glue job data previews, temporary files may not be automatically removed from *tempS3Path*. To ensure the removal of temporary files, directly end the data preview session by choosing **End session** in the **Data preview** pane.  
If you cannot guarantee the data preview session is ended directly, consider setting Amazon S3 Lifecycle configuration to remove old data. We recommend removing data older than 49 hours, based on maximum job runtime plus a margin. For more information about configuring Amazon S3 Lifecycle, see [Managing your storage lifecycle](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lifecycle-mgmt.html) in the Amazon S3 documentation.
+ An IAM policy with appropriate permissions to your Amazon S3 path you can associate with your AWS Glue job role.
+ If your Vertica instance is in an Amazon VPC, configure Amazon VPC to allow your AWS Glue job to communicate with the Vertica instance without traffic traversing the public internet. 

  In Amazon VPC, identify or create a **VPC**, **Subnet** and **Security group** that AWS Glue will use while executing the job. Additionally, you need to ensure Amazon VPC is configured to permit network traffic between your Vertica instance and this location. Your job will need to establish a TCP connection with your Vertica client port (default 5433). Based on your network layout, this may require changes to security group rules, network ACLs, NAT gateways, and peering connections.

**To configure a connection to Vertica:**

1. In AWS Secrets Manager, create a secret using your Vertica credentials, *verticaUsername* and *verticaPassword*. To create a secret in Secrets Manager, follow the tutorial available in [ Create an AWS Secrets Manager secret ](https://docs.aws.amazon.com//secretsmanager/latest/userguide/create_secret.html) in the AWS Secrets Manager documentation. After creating the secret, keep the Secret name, *secretName* for the next step. 
   + When selecting **Key/value pairs**, create a pair for the key `user` with the value *verticaUsername*.
   + When selecting **Key/value pairs**, create a pair for the key `password` with the value *verticaPassword*.

1. In the AWS Glue console, create a connection by following the steps in [Adding an AWS Glue connection](console-connections.md). After creating the connection, keep the connection name, *connectionName*, for the next step. 
   + When selecting a **Connection type**, select Vertica.
   + When selecting **Vertica Host**, provide the hostname of your Vertica installation.
   + When selecting **Vertica Port**, provide the port through which your Vertica installation is available.
   + When selecting an **AWS Secret**, provide *secretName*.

1. In the following situations, you may require additional configuration:
   + 

     For Vertica instances hosted on AWS in an Amazon VPC
     + Provide Amazon VPC connection information to the AWS Glue connection that defines your Vertica security credentials. When creating or updating your connection, set **VPC**, **Subnet** and **Security groups** in **Network options**.

You will need to perform the following steps before running your AWS Glue job:
+ Grant the IAM role associated with your AWS Glue job permissions to *tempS3Path*.
+ Grant the IAM role associated with your AWS Glue job permission to read *secretName*.
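The following is a minimal sketch of an IAM policy granting the job role access to *tempS3Path*, assuming a hypothetical location of `s3://amzn-s3-demo-bucket/glue-temp/`; adjust the actions and resources to your requirements:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::amzn-s3-demo-bucket",
        "arn:aws:s3:::amzn-s3-demo-bucket/glue-temp/*"
      ]
    }
  ]
}
```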

# Creating a Vertica source node


## Prerequisites needed

+ A Vertica type AWS Glue Data Catalog connection, *connectionName* and a temporary Amazon S3 location, *tempS3Path*, as described in the previous section, [Creating a Vertica connection](creating-vertica-connection.md).
+ A Vertica table you would like to read from, *tableName*, or query *targetQuery*.

## Adding a Vertica data source


**To add a Data source – Vertica node:**

1.  Choose the connection for your Vertica data source. Since you have created it, it should be available in the dropdown. If you need to create a connection, choose **Create Vertica connection**. For more information see the previous section, [Creating a Vertica connection](creating-vertica-connection.md). 

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1. Choose the **Database** containing your table.

1. For **Staging area in Amazon S3**, enter an S3A URI pointing to *tempS3Path*.

1. Choose the **Vertica Source**.
   +  **Choose a single table** – access all data from a single table. 
   +  **Enter custom query** – access a dataset from multiple tables based on your custom query. 

1.  If you chose a single table, enter *tableName* and optionally select a **Schema**. 

    If you chose **Enter custom query**, enter a SQL SELECT query and optionally select a **Schema**. 

1.  In **Custom Vertica properties**, enter parameters and values as needed. 

# Creating a Vertica target node


## Prerequisites needed

+ A Vertica type AWS Glue Data Catalog connection, *connectionName* and a temporary Amazon S3 location, *tempS3Path*, as described in the previous section, [Creating a Vertica connection](creating-vertica-connection.md).

## Adding a Vertica data target


**To add a Data target – Vertica node:**

1.  Choose the connection for your Vertica data target. Since you have created it, it should be available in the dropdown. If you need to create a connection, choose **Create Vertica connection**. For more information, see the previous section, [Creating a Vertica connection](creating-vertica-connection.md). 

    Once you have chosen a connection, you can view the connection properties by clicking **View properties**. 

1. Choose the **Database** containing your table.

1. For **Staging area in Amazon S3**, enter the S3A URI for *tempS3Path*.

1. Enter *tableName* and optionally select a **Schema**. 

1.  In **Custom Vertica properties**, enter parameters and values as needed. 

## Advanced options


You can provide advanced options when creating a Vertica node. These options are the same as those available when programming AWS Glue for Spark scripts.

See [Vertica connections](aws-glue-programming-etl-connect-vertica-home.md).

# Connecting to WooCommerce

WooCommerce is a flexible, open-source e-commerce solution built for WordPress-based websites. It's commonly used to create online shops, turning a regular website into a fully functioning online store.

**Topics**
+ [

# AWS Glue support for WooCommerce
](woocommerce-support.md)
+ [

# Policies containing the API operations for creating and using connections
](woocommerce-configuring-iam-permissions.md)
+ [

# Configuring WooCommerce
](woocommerce-configuring.md)
+ [

# Configuring WooCommerce connections
](woocommerce-configuring-connections.md)
+ [

# Reading from WooCommerce entities
](woocommerce-reading-from-entities.md)
+ [

# WooCommerce connection options
](woocommerce-connection-options.md)

# AWS Glue support for WooCommerce


AWS Glue supports WooCommerce as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from WooCommerce.

**Supported as a target?**  
No.

**Supported WooCommerce API versions**  
The following WooCommerce API versions are supported:
+ v3

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring WooCommerce


Before you can use AWS Glue to transfer data from WooCommerce, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a WooCommerce account with a `consumerKey` and a `consumerSecret`.
+ Your WooCommerce account has API access with a valid license.

If you meet these requirements, you’re ready to connect AWS Glue to your WooCommerce account. For typical connections, you don't need to do anything else in WooCommerce.

# Configuring WooCommerce connections


WooCommerce supports custom authentication. For public WooCommerce documentation on generating the required API keys for custom authorization, see [Authentication – WooCommerce REST API Documentation](https://woocommerce.github.io/woocommerce-rest-api-docs/#authentication).

To configure a WooCommerce connection:

1. In AWS Secrets Manager, create a secret with the following details:
   + For a customer managed connected app, the secret should contain the connected app consumer credentials, with `consumerKey` and `consumerSecret` as keys. Note: you must create one secret per connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select WooCommerce.

   1. Provide the `INSTANCE_URL` of the WooCommerce instance you want to connect to.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version":"2012-10-17",		 	 	 
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that AWS Glue should use to store tokens for this connection.

   1. Select the network options if you want to connect through your own network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.
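The secret in step 1 can also be created programmatically. The sketch below builds the JSON payload with the two required keys; the secret name and credential values are placeholders, and the boto3 call is shown but not executed:

```python
import json

# Build the secret payload with the two keys the connector expects.
# The credential values here are placeholders -- use your own API keys.
secret_payload = json.dumps({
    "consumerKey": "ck_example_key",
    "consumerSecret": "cs_example_secret",
})
print(secret_payload)

# To store it in AWS Secrets Manager (one secret per Glue connection):
# import boto3
# boto3.client("secretsmanager").create_secret(
#     Name="glue-woocommerce-secret",  # placeholder name
#     SecretString=secret_payload,
# )
```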

# Reading from WooCommerce entities


**Prerequisite**

A WooCommerce object you would like to read from. You will need the object name, such as coupon, order, or product.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select \* | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Coupon | Yes | Yes | Yes | Yes | Yes | 
| Coupon Total | No | No | No | Yes | No | 
| Customers Total | No | No | No | Yes | No | 
| Order | Yes | Yes | Yes | Yes | Yes | 
| Orders Total | No | No | No | Yes | No | 
| Payment Gateway | No | No | No | Yes | No | 
| Product | Yes | Yes | Yes | Yes | Yes | 
| Product attribute | Yes | Yes | Yes | Yes | Yes | 
| Product category | Yes | Yes | Yes | Yes | Yes | 
| Product review | Yes | Yes | Yes | Yes | Yes | 
| Product shipping class | Yes | Yes | Yes | Yes | Yes | 
| Product tag | Yes | Yes | Yes | Yes | Yes | 
| Product variation | Yes | Yes | Yes | Yes | Yes | 
| Products Total | No | No | No | Yes | No | 
| Report (List) | No | No | No | Yes | No | 
| Reviews Total | No | No | No | Yes | No | 
| Sales Report | Yes | No | No | Yes | No | 
| Shipping Method | No | No | No | Yes | No | 
| Shipping Zone | No | No | No | Yes | No | 
| Shipping Zone Location | No | No | No | Yes | No | 
| Shipping Zone Method | No | No | No | Yes | No | 
| Tax Rate | Yes | Yes | Yes | Yes | Yes | 
| Tax Class | No | No | No | Yes | No | 
| Top Sellers Report | Yes | No | No | Yes | No | 

**Example**:

```
woocommerce_read = glueContext.create_dynamic_frame.from_options(
    connection_type="glue.spark.woocommerce",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "coupon",
        "API_VERSION": "v3",
        "INSTANCE_URL": "instanceUrl"
    }
)
```

**WooCommerce entity and field details**:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/woocommerce-reading-from-entities.html)

**Note**  
Struct and List data types are converted to String data type, and DateTime data type is converted to Timestamp in the response of the connectors.

## Partitioning queries


**Record-based partitioning**:

You can provide the additional Spark option `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With this parameter, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently.

In record-based partitioning, the total number of records is queried from the WooCommerce API and divided by the `NUM_PARTITIONS` value you provide. Each sub-query then concurrently fetches its share of the records.
+ `NUM_PARTITIONS`: the number of partitions.

The following entities support record-based partitioning:
+ coupon
+ order
+ product
+ product-attribute
+ product-attribute-term
+ product-category
+ product-review
+ product-shipping-class
+ product-tag
+ product-variation
+ tax-rate

Example:

```
woocommerce_read = glueContext.create_dynamic_frame.from_options(
    connection_type="glue.spark.woocommerce",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "coupon",
        "API_VERSION": "v3",
        "INSTANCE_URL": "instanceUrl"
        "NUM_PARTITIONS": "10"
    }
)
```
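The arithmetic behind record-based partitioning can be illustrated with a small sketch. The connector's exact splitting logic is internal to AWS Glue; this only shows how a record count divides into `NUM_PARTITIONS` roughly equal ranges:

```python
def partition_ranges(total_records: int, num_partitions: int):
    """Split [0, total_records) into num_partitions contiguous ranges."""
    base, extra = divmod(total_records, num_partitions)
    ranges, start = [], 0
    for i in range(num_partitions):
        # Spread the remainder across the first `extra` partitions.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges

# 95 records across 10 partitions: five ranges of 10, then five of 9.
print(partition_ranges(95, 10))
```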


# WooCommerce connection options


The following are connection options for WooCommerce:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in WooCommerce.
+ `API_VERSION`(String) - (Required) Used for Read. WooCommerce Rest API version you want to use.
+ `REALM_ID`(String) - An ID that identifies an individual WooCommerce Online company where you send requests.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT \*). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `INSTANCE_URL`(String) - (Required) A valid WooCommerce instance URL with the format: https://<instance>.wpcomstaging.com
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read.

# Connecting to Zendesk

Zendesk is a cloud-based help desk management solution offering customizable tools to build a customer service portal, a knowledge base, and online communities.

**Topics**
+ [

# AWS Glue support for Zendesk
](zendesk-support.md)
+ [

# Policies containing the API operations for creating and using connections
](zendesk-configuring-iam-permissions.md)
+ [

# Configuring Zendesk
](zendesk-configuring.md)
+ [

# Configuring Zendesk connections
](zendesk-configuring-connections.md)
+ [

# Reading from Zendesk entities
](zendesk-reading-from-entities.md)
+ [

# Zendesk connection options
](zendesk-connection-options.md)
+ [

# Limitations
](zendesk-limitations.md)

# AWS Glue support for Zendesk


AWS Glue supports Zendesk as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Zendesk.

**Supported as a target?**  
No.

**Supported Zendesk API versions**  
The following Zendesk API versions are supported:
+ v2

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Zendesk


Before you can use AWS Glue to transfer data from Zendesk, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a Zendesk account. For more information, see [Creating a Zendesk account](#zendesk-configuring-creating-account).
+ Your Zendesk account is enabled for API access.
+ Your Zendesk account allows you to install connected apps.

If you meet these requirements, you’re ready to connect AWS Glue to your Zendesk account.

## Creating a Zendesk account


To create a Zendesk account:

1. Go to https://www.zendesk.com/in/register/

1. Enter details such as your work email, first name, last name, phone number, job title, company name, number of employees, password, and preferred language. Then choose **Complete trial Signup**.

1. Once your account is created, open the verification link you received to verify your email address.

1. Once the work email address is verified, you are redirected to your Zendesk account. Choose the **Buy Zendesk** option for your preferred plan. Note: for the Zendesk connector, it is recommended to purchase the Suite Enterprise plan.

## Creating a client app and OAuth 2.0 credentials


To create a client app and OAuth 2.0 credentials:

1. Log in to your Zendesk account where you want the OAuth 2.0 app to be created, at https://www.zendesk.com/in/login/

1. Click the gear icon. Choose the **Go to admin center** link to open the admin center page.

1. Choose **Apps and integrations** in the left sidebar, then select **APIs** > **Zendesk API**.

1. On the Zendesk API page, choose the **OAuth Clients** tab.

1. Choose **Add OAuth Client** on the right side.

1. Complete the following fields to create a client:

   1. Client Name - Enter a name for your app. This is the name that users will see when asked to grant access to your application, and when they check the list of third-party apps that have access to their Zendesk.

   1. Description - Optional. A short description of your app that users will see when asked to grant access to it.

   1. Company - Optional. The company name that users will see when asked to grant access to your application. The information can help them understand who they're granting access to.

   1. Logo - Optional. This is the logo that users will see when asked to grant access to your application. The image can be JPG, GIF, or PNG. For best results, upload a square image. It will be resized for the authorization page.

   1. Unique Identifier - The field is auto-populated with a reformatted version of the name you entered for your app. You can change it if you want.

   1. Redirect URLs - Enter the URL or URLs that Zendesk should use to send the user's decision to grant access to your application.

      For example: https://us-east-1.console.aws.amazon.com/gluestudio/oauth

1. Click **Save**.

1. After the page refreshes, a new pre-populated **Secret** field appears near the bottom of the page. This is the `client_secret` value specified in the OAuth 2.0 spec. Copy the Secret value to your clipboard and save it somewhere safe. Note: the characters may extend past the width of the text box, so make sure to select everything before copying.

1. Click **Save**.

# Configuring Zendesk connections


The Zendesk connector supports the Authorization Code grant type.
+ This grant type is considered "three-legged" OAuth as it relies on redirecting users to a third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue console. The user creating a connection may by default rely on an AWS Glue-owned connected app (AWS Glue-managed client application) where they do not need to provide any OAuth related information except for their Zendesk instance URL. The AWS Glue console will redirect the user to Zendesk where the user must login and allow AWS Glue the requested permissions to access their Zendesk instance.
+ You may still opt to create your own connected app in Zendesk and provide your own client ID and client secret when creating connections through the AWS Glue console. In this scenario, you will still be redirected to Zendesk to login and authorize AWS Glue to access your resources.
+ This grant type results in an access token. The access token never expires.

For public Zendesk documentation on creating a connected app for the Authorization Code OAuth flow, see [OAuth Tokens for Grant Types](https://developer.zendesk.com/api-reference/ticketing/oauth/grant_type_tokens/).
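The redirect in this flow resolves to Zendesk's authorization endpoint. The sketch below assembles that URL; the subdomain, client ID, and redirect URI are placeholders, and the endpoint path follows Zendesk's public OAuth documentation:

```python
from urllib.parse import urlencode

# All values below are placeholders for illustration.
params = {
    "response_type": "code",        # Authorization Code grant
    "client_id": "my_glue_client",  # the app's Unique Identifier
    "redirect_uri": "https://us-east-1.console.aws.amazon.com/gluestudio/oauth",
    "scope": "read",
}
auth_url = (
    "https://example.zendesk.com/oauth/authorizations/new?" + urlencode(params)
)
print(auth_url)
```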

To configure a Zendesk connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For the Authorization Code grant type: for a customer managed connected app, the secret should contain the connected app client secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key.

   1. Note: You must create a secret per connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select Zendesk.

   1. Provide the `INSTANCE_URL` of the Zendesk instance you want to connect to.

   1. Provide the Zendesk environment.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version":"2012-10-17",		 	 	 
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that AWS Glue should use to store tokens for this connection.

   1. Select the network options if you want to connect through your own network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

# Reading from Zendesk entities


**Prerequisite**

A Zendesk object you would like to read from. You will need the object name, such as ticket, user, or article, as listed in the following table.


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select \* | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Ticket | Y | Y | Y | Y | N | 
| User | Y | Y | Y | Y | N | 
| Organization | Y | Y | Y | Y | N | 
| Article | Y | Y | N | Y | N | 
| Ticket Event | Y | Y | N | Y | N | 
| Ticket Metric Event | Y | Y | N | Y | N | 
| Ticket Comment | Y | Y | Y | Y | N | 
| Ticket Field | Y | Y | N | Y | N | 
| Ticket Metric | Y | Y | N | Y | N | 
| Ticket Activity | Y | Y | N | Y | N | 
| Ticket Skip | N | Y | N | Y | N | 
| Group | Y | Y | Y | Y | N | 
| Group Membership | N | Y | Y | Y | N | 
| Satisfaction Rating | Y | Y | N | Y | N | 
| View | Y | Y | Y | Y | N | 
| Trigger | Y | Y | Y | Y | N | 
| Trigger Category | N | Y | Y | Y | N | 
| Macro | Y | Y | Y | Y | N | 
| Automation | N | Y | Y | Y | N | 

**Example**:

```
Zendesk_read = glueContext.create_dynamic_frame.from_options(
    connection_type="Zendesk",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "Account",
        "API_VERSION": "v2"
    }
)
```

**Zendesk entities and field details**:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/zendesk-reading-from-entities.html)

**Note**  
Struct and List data types are converted to String data type in the response of the connector.

## Partitioning queries


Partitions are not supported in Zendesk.

# Zendesk connection options


The following are connection options for Zendesk:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your Object in Zendesk.
+ `API_VERSION`(String) - (Required) Used for Read. Zendesk Rest API version you want to use. For example: v2.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT \*). Used for Read. Columns you want to select for the object. For example: id, name, url, created_at
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format. For example: group_id = 100
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query. For example: "SELECT id,url FROM users WHERE role=\"end-user\""
+ `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition query. Default field is `update_at` for entities supporting the incremental export API (`created_at` for `ticket-events` and `time` for `ticket-metric-events`).
+ `LOWER_BOUND`(String)- Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field. Optional; this option will be handled by the connector if not provided in the job option. Default value: "2024-05-01T20:55:02.000Z".
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read. Optional; this option will be handled by the connector if not provided in the job option.
+ `IMPORT_DELETED_RECORDS`(String) - Default: FALSE. Used for Read. Set to TRUE to include deleted records when querying.
+ `ACCESS_TOKEN` - Access token to be used in the request.
+ `INSTANCE_URL` - URL of the instance where the user wants to run the operations. For example: https://*subdomain*.zendesk.com
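Several of these options can be combined in a single read. The dict below is a sketch of such a combination; the entity, fields, and predicate are illustrative, and the dict would be passed as `connection_options` to `glueContext.create_dynamic_frame.from_options` as in the earlier example:

```python
# Illustrative Zendesk read options (entity, fields, and predicate are
# examples, not values from your account).
connection_options = {
    "connectionName": "connectionName",
    "ENTITY_NAME": "ticket",
    "API_VERSION": "v2",
    "SELECTED_FIELDS": "id, url, created_at",
    "FILTER_PREDICATE": "group_id = 100",
    "IMPORT_DELETED_RECORDS": "FALSE",
}
print(connection_options["ENTITY_NAME"])
```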

# Limitations


The following are limitations of the Zendesk connector:
+ Offset-based pagination limits the number of pages that can be fetched to 100, and is not recommended because it caps the total number of records that can be fetched at 10,000. The cursor-based pagination implemented for the Zendesk connector overcomes this limitation. Only the EQUAL_TO filter operator is supported through the Zendesk API.

  Because of this limitation, partitioning is not supported for the Zendesk connector.
+ For the "Ticket Event" entity the Rate Limit is 10 requests per minute. While running a AWS Glue ETL job you may receive a 429 (too many requests) error.

# Connecting to Zoho CRM

Zoho CRM acts as a single repository to bring sales, marketing, and customer support activities together, and streamline process, policy, and people in one platform. Zoho CRM can be easily customized to meet the specific needs of any business type and size.

Zoho CRM's developer platform offers the right mix of low-code and pro-code tools for businesses and enterprises to automate work, integrate data across the enterprise stack, and create custom solutions for web and mobile.

**Topics**
+ [

# AWS Glue support for Zoho CRM
](zoho-crm-support.md)
+ [

# Policies containing the API operations for creating and using connections
](zoho-crm-configuring-iam-permissions.md)
+ [

# Configuring Zoho CRM
](zoho-crm-configuring.md)
+ [

# Configuring Zoho CRM connections
](zoho-crm-configuring-connections.md)
+ [

# Reading from Zoho CRM entities
](zoho-crm-reading-from-entities.md)
+ [

# Zoho CRM connection options
](zoho-crm-connection-options.md)
+ [

# Limitations and notes for Zoho CRM connector
](zoho-crm-connector-limitations.md)

# AWS Glue support for Zoho CRM


AWS Glue supports Zoho CRM as follows:

**Supported as a source?**  
Yes – Sync and Async. You can use AWS Glue ETL jobs to query data from Zoho CRM.

**Supported as a target?**  
No.

**Supported Zoho CRM API versions**  
The following Zoho CRM API versions are supported:
+ v7

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",		 	 	 
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

Alternatively, you can use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Zoho CRM


Before you can use AWS Glue to transfer data from Zoho CRM, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a Zoho CRM account.
+ Your Zoho CRM account is enabled for API access.
+ You have a registered API client under the API Console to obtain OAuth Credentials.

# Configuring Zoho CRM connections


The grant type determines how AWS Glue communicates with Zoho CRM to request access to your data. Your choice affects the requirements that you must meet before you create the connection. Zoho CRM supports only the AUTHORIZATION_CODE grant type for OAuth 2.0.
+ This grant type is considered "three-legged" OAuth as it relies on redirecting users to a third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue console. The AWS Glue console will redirect the user to Zoho CRM where the user must log in and allow AWS Glue the requested permissions to access their Zoho CRM instance.
+ Users may still opt to create their own connected app in Zoho CRM and provide their own client ID, Auth URL, Token URL, and Instance URL when creating connections through the AWS Glue console. In this scenario, they will still be redirected to Zoho CRM to login and authorize AWS Glue to access their resources.
+ This grant type results in a refresh token and access token. The access token will remain valid for one hour, and may be refreshed automatically without user interaction using the refresh token.
+ For public Zoho CRM documentation on creating a connected app for Authorization Code OAuth flow, see [Authentication](https://www.zoho.com/crm/developer/docs/api/v7/oauth-overview.html).
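The automatic refresh described above amounts to a standard OAuth token request. The sketch below assembles such a request against Zoho's token endpoint; all credential values are placeholders, and the exact endpoint host may vary by Zoho data center:

```python
from urllib.parse import urlencode

# Placeholder credentials for illustration only.
token_params = {
    "grant_type": "refresh_token",
    "refresh_token": "1000.placeholder.refreshtoken",
    "client_id": "placeholder_client_id",
    "client_secret": "placeholder_client_secret",
}
token_request = (
    "https://accounts.zoho.com/oauth/v2/token?" + urlencode(token_params)
)
print(token_request)
```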

To configure a Zoho CRM connection:

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select Zoho CRM.

   1. Provide the `INSTANCE_URL` of the Zoho CRM instance you want to connect to.

   1. Provide the user client application client ID.

   1. Select the appropriate **Auth URL** from the dropdown.

   1. Select the appropriate **Token URL** from the dropdown.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version":"2012-10-17",		 	 	 
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that AWS Glue should use to store tokens for this connection.

   1. Select the network options if you want to connect through your own network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

1. In your AWS Glue job configuration, provide `connectionName` as an **Additional network connection**.

# Reading from Zoho CRM entities


**Prerequisite**

Zoho CRM objects you would like to read from. You will need the object name.

**Supported entities for Sync source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select \* | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Product | Yes | Yes | Yes | Yes | Yes | 
| Quote | Yes | Yes | Yes | Yes | Yes | 
| Purchase Order | Yes | Yes | Yes | Yes | Yes | 
| Solution | Yes | Yes | Yes | Yes | Yes | 
| Call | Yes | Yes | Yes | Yes | Yes | 
| Task | Yes | Yes | Yes | Yes | Yes | 
| Event | Yes | Yes | Yes | Yes | Yes | 
| Invoice | Yes | Yes | Yes | Yes | Yes | 
| Account | Yes | Yes | Yes | Yes | Yes | 
| Contact | Yes | Yes | Yes | Yes | Yes | 
| Vendor | Yes | Yes | Yes | Yes | Yes | 
| Campaign | Yes | Yes | Yes | Yes | Yes | 
| Deal | Yes | Yes | Yes | Yes | Yes | 
| Lead | Yes | Yes | Yes | Yes | Yes | 
| Custom Module | Yes | Yes | Yes | Yes | Yes | 
| Sales Order | Yes | Yes | Yes | Yes | Yes | 
| Price Books | Yes | Yes | Yes | Yes | Yes | 
| Case | Yes | Yes | Yes | Yes | Yes | 

**Example**:

```
zoho_read = glueContext.create_dynamic_frame.from_options(
    connection_type="ZOHO",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v7",
        "INSTANCE_URL": "https://www.zohoapis.in/"
    }
)
```

**Supported entities for Async source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select \* | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Product | Yes | No | No | Yes | No | 
| Quote | Yes | No | No | Yes | No | 
| Purchase Order | Yes | No | No | Yes | No | 
| Solution | Yes | No | No | Yes | No | 
| Call | Yes | No | No | Yes | No | 
| Task | Yes | No | No | Yes | No | 
| Event | Yes | No | No | Yes | No | 
| Invoice | Yes | No | No | Yes | No | 
| Account | Yes | No | No | Yes | No | 
| Contact | Yes | No | No | Yes | No | 
| Vendor | Yes | No | No | Yes | No | 
| Campaign | Yes | No | No | Yes | No | 
| Deal | Yes | No | No | Yes | No | 
| Lead | Yes | No | No | Yes | No | 
| Custom Module | Yes | No | No | Yes | No | 
| Sales Order | Yes | No | No | Yes | No | 
| Price Books | Yes | No | No | Yes | No | 
| Case | Yes | No | No | Yes | No | 

**Example**:

```
zoho_read = glueContext.create_dynamic_frame.from_options(
    connection_type="ZOHO",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v7",
        "INSTANCE_URL": "https://www.zohoapis.in/",
        "TRANSFER_MODE": "ASYNC"
    }
)
```

**Zoho CRM field details**:

Zoho CRM provides endpoints to fetch metadata dynamically for supported entities. Therefore, operator support is captured at the datatype level.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/zoho-crm-reading-from-entities.html)

## Partitioning queries


Partitioning is not supported in Async mode.

**Filter-based partitioning (Sync mode)**:

You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With these parameters, the original query would be split into `NUM_PARTITIONS` number of sub-queries that can be executed by Spark tasks concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For the Datetime field, we accept the Spark timestamp format used in Spark SQL queries.

  Example of a valid value:

  ```
  "2024-09-30T01:01:01.000Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.

Example:

```
zoho_read = glueContext.create_dynamic_frame.from_options(
    connection_type="zohocrm",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "entityName",
        "API_VERSION": "v7",
        "PARTITION_FIELD": "Created_Time",
        "LOWER_BOUND": "2022-01-01T01:01:01.000Z",
        "UPPER_BOUND": "2024-01-01T01:01:01.000Z",
        "NUM_PARTITIONS": "10"
    }
)
```
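The splitting can be sketched in plain Python. This is a minimal illustration of the documented behavior — equal sub-ranges between an inclusive `LOWER_BOUND` and an exclusive `UPPER_BOUND` — not the connector's actual implementation:

```python
from datetime import datetime

def split_partition_range(lower_bound: str, upper_bound: str, num_partitions: int):
    """Split [lower_bound, upper_bound) into num_partitions contiguous
    sub-ranges, one per concurrent Spark task."""
    fmt = "%Y-%m-%dT%H:%M:%S.%f%z"
    lo = datetime.strptime(lower_bound.replace("Z", "+0000"), fmt)
    hi = datetime.strptime(upper_bound.replace("Z", "+0000"), fmt)
    step = (hi - lo) / num_partitions
    # Each tuple is an (inclusive lower, exclusive upper) sub-range.
    return [(lo + i * step, lo + (i + 1) * step) for i in range(num_partitions)]

ranges = split_partition_range(
    "2022-01-01T01:01:01.000Z", "2024-01-01T01:01:01.000Z", 10
)
```

With `NUM_PARTITIONS` set to 10 over a two-year window, each sub-query covers 73 days, and adjacent sub-ranges share a boundary so no records are skipped or duplicated.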

# Zoho CRM connection options


The following are connection options for Zoho CRM:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in Zoho CRM.
+ `API_VERSION`(String) - (Required) Used for Read. The Zoho CRM REST API version you want to use.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT \*). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition the query.
+ `LOWER_BOUND`(String) - Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read.
+ `INSTANCE_URL`(String) - (Required) Used for Read. A valid Zoho CRM instance URL.
+ `TRANSFER_MODE`(String) - Used for Read. Indicates whether the query should run in Async mode. Set to `ASYNC` for asynchronous reads.
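Several of these options can be combined in a single read. The sketch below assembles the options map in plain Python before handing it to `create_dynamic_frame.from_options`; the connection name, entity, and field names are placeholders for illustration, not values from any real Zoho CRM org:

```python
# Hypothetical option values; substitute your own connection name,
# entity, and fields.
zoho_options = {
    "connectionName": "my-zoho-connection",
    "ENTITY_NAME": "Lead",
    "API_VERSION": "v7",
    "INSTANCE_URL": "https://www.zohoapis.in/",
    # Per the option reference, SELECTED_FIELDS is a list of columns,
    # overriding the default SELECT *:
    "SELECTED_FIELDS": ["id", "Created_Time", "Last_Name"],
    # FILTER_PREDICATE is expressed in Spark SQL format:
    "FILTER_PREDICATE": "Created_Time > '2023-01-01T00:00:00.000Z'",
}

# In a Glue job, the map is passed as connection_options:
# zoho_read = glueContext.create_dynamic_frame.from_options(
#     connection_type="zohocrm",
#     connection_options=zoho_options,
# )
```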

# Limitations and notes for Zoho CRM connector


The following are limitations or notes for the Zoho CRM connector:
+ With API version v7, you can fetch a maximum of 100,000 records. See the [Zoho documentation](https://www.zoho.com/crm/developer/docs/api/v7/get-records.html).
+ For the Event entity, the label "Meeting" is displayed as mentioned in the [Zoho documentation](https://www.zoho.com/crm/developer/docs/api/v7/modules-api.html).
+ For Select All functionality:
  + You can fetch a maximum of 50 fields from the SaaS platform for both GET and POST calls.
  + If you want data for a specific field that is not among the first 50 fields, you must manually provide the list of selected fields.
  + If more than 50 fields are selected, any fields beyond the first 50 are trimmed and contain null data in Amazon S3.
  + In the case of a filter expression, if the user-provided list of 50 fields does not include `id` and `Created_Time`, a custom exception is raised prompting the user to include these fields.
+ Filter operators may vary from field to field despite the fields having the same data type. Therefore, you must manually specify a different operator for any field that triggers an error in the SaaS platform. 
+ For Sort By functionality:
  + Data can only be sorted by a single field without a filter expression, whereas data can be sorted by multiple fields when a filter expression is applied.
  + If no sort order is specified for the selected field, the data will be retrieved in ascending order by default. 
+ The supported regions for the Zoho CRM connector are US, Europe, India, Australia, and Japan.
+ Async read functionality [Limitations:](https://www.zoho.com/crm/developer/docs/api/v7/bulk-read/limitations.html)
  + Limit, order by, and partitioning are not supported in Async mode. 
  + In Async mode, you can transfer up to 500 pages of data, with 200,000 records per page.
  + For a one-minute interval, only 10 requests are allowed for download. When you exceed the download limit, the system returns an HTTP 429 error and pauses all download requests for one minute before processing can resume.
  + After completing the bulk job, you can access the downloadable file only for a period of one day. After that, you cannot access the file via endpoints.
  + A maximum of 200 select fields can be given via an endpoint. If you specify more than 200 select fields in an endpoint, the system will automatically export all available fields for that module.
  + External fields created in any module are not supported in Bulk Read APIs.
  + Sorting and `Group_by` clauses are not supported via this API endpoint.
  + The values of the fields with sensitive health data will be retrieved only when the **Restrict Data access through API** option in the compliance settings is **disabled**. If the option is enabled, the value will be **empty** in the result.
  + Filtration/Criteria Limits
    + The maximum number of criteria that can be used in a query is 25.
    + Filtration/Criteria on multiline text fields is not supported.
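The Select All constraints above can be checked client-side before a job is submitted. The following is a minimal sketch of a hypothetical helper (not part of the connector) that mirrors the documented rules — the 50-field cap and the requirement that `id` and `Created_Time` accompany any filter expression:

```python
def validate_selected_fields(fields, has_filter: bool, max_fields: int = 50):
    """Return the field list that would actually be honored.

    Raises on the condition the connector reports as a custom exception:
    a filter expression whose field list omits 'id' or 'Created_Time'.
    """
    if has_filter and not {"id", "Created_Time"}.issubset(fields):
        raise ValueError(
            "Filter expressions require 'id' and 'Created_Time' in the field list"
        )
    # Fields beyond the cap are trimmed; their data comes back null in S3.
    return list(fields)[:max_fields]
```

For example, passing 60 fields without a filter returns only the first 50; the trailing 10 would contain null data in the Amazon S3 output.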

# Connecting to Zoom Meetings

Zoom Meetings is a cloud-based video conferencing platform that can be used for video conferencing meetings, audio conferencing, webinars, meeting recordings, and live chat.

**Topics**
+ [

# AWS Glue support for Zoom Meetings
](zoom-meetings-support.md)
+ [

# Policies containing the API operations for creating and using connections
](zoom-meetings-configuring-iam-permissions.md)
+ [

# Configuring Zoom Meetings
](zoom-meetings-configuring.md)
+ [

# Configuring the Zoom Meetings client app
](zoom-meetings-configuring-client-app.md)
+ [

# Configuring Zoom Meetings connections
](zoom-meetings-configuring-connections.md)
+ [

# Reading from Zoom Meetings entities
](zoom-meetings-reading-from-entities.md)
+ [

# Zoom Meetings connection options
](zoom-meetings-connection-options.md)
+ [

# Zoom Meetings limitations
](zoom-meetings-limitations.md)

# AWS Glue support for Zoom Meetings


AWS Glue supports Zoom Meetings as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Zoom Meetings.

**Supported as a target?**  
No.

**Supported Zoom Meetings API versions**  
The following Zoom Meetings API versions are supported:
+ v2

# Policies containing the API operations for creating and using connections

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

------
#### [ JSON ]

****  

```
{
  "Version":"2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

------

If you don't want to use the above method, you can alternatively use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring Zoom Meetings


Before you can use AWS Glue to transfer data from Zoom Meetings, you must meet these requirements:

## Minimum requirements


The following are minimum requirements:
+ You have a Zoom Meetings account.
+ Your Zoom account is enabled for API access.
+ You have created an OAuth2 app in your Zoom Meetings account. This integration provides the credentials that AWS Glue uses to access your data securely when it makes authenticated calls to your account. For more information, see [Configuring the Zoom Meetings client app](zoom-meetings-configuring-client-app.md).

If you meet these requirements, you’re ready to connect AWS Glue to your Zoom Meetings account. For typical connections, you don't need to do anything else in Zoom Meetings.

# Configuring the Zoom Meetings client app


1. Log into the Zoom App Marketplace.

1. Choose **Develop** > **Build App**.

1. Choose **General App** for an OAuth 2.0 based app.

1. On the **Basic Info** page, add or update information about the app such as the app's name, how the app is managed, app credentials, and OAuth information.

1. In the **Select how the app is managed** section, confirm how you want your app to be managed:

   1. **Admin-managed**: Account admins add and manage the app

   1. **User-managed**: Individual users add and manage the app. The app has access to only the user's authorized data.

1. **App Credentials**: The build flow automatically generates app credentials (client ID and client secret) for your app.

1. In the OAuth Information section, set up OAuth for your app.

   1. **OAuth redirect URL** (required): Enter your redirect URL or endpoint to set up OAuth between your app and Zoom.

   1. **Use Strict Mode URL** (optional)

   1. **Subdomain check** (optional)

   1. **OAuth allow lists** (required): Add any unique URLs that Zoom should allow as valid redirects for your OAuth flows.

1. On the **Scopes** page, select the Zoom API methods your app is allowed to call. The scopes define which information and capabilities are available to your user. Select the following granular scopes:
   + `user:read:list_users:admin`
   + `zoom_rooms:read:list_rooms:admin`
   + `group:read:list_members:admin`
   + `group:read:administrator:admin`
   + `group:read:list_groups:admin`
   + `report:read:admin`
   + `role:read:list_roles`, `role:read:list_roles:admin`

   Once the scopes are added, choose **Continue**. The app is then ready to use.

For more information about OAuth 2.0 setup, see [Integrations (OAuth apps)](https://developers.zoom.us/docs/integrations/).

# Configuring Zoom Meetings connections


Zoom Meetings supports the `AUTHORIZATION_CODE` grant type for OAuth2. The grant type determines how AWS Glue communicates with Zoom Meetings to request access to your data.
+ This grant type is considered "three-legged" OAuth because it relies on redirecting users to a third-party authorization server to authenticate the user. It is used when creating connections via the AWS Glue console. The user creating a connection needs to provide OAuth-related information, such as the client ID and client secret, for their Zoom Meetings client application. The AWS Glue console redirects the user to Zoom, where the user must log in and allow AWS Glue the requested permissions to access their Zoom Meetings instance.
+ Users may still opt to create their own connected app in Zoom Meetings and provide their own client ID and client secret when creating connections through the AWS Glue console. In this scenario, they will still be redirected to Zoom Meetings to login and authorize AWS Glue to access their resources.
+ This grant type results in a refresh token and an access token. The access token is short-lived and can be refreshed automatically, without user interaction, using the refresh token.
+ For public Zoom Meetings documentation on creating a connected app for Authorization Code OAuth flow, see [Using OAuth 2.0](https://developers.zoom.us/docs/api/using-zoom-apis/#using-oauth-20).

To configure a Zoom Meetings connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For a customer-managed connected app, the secret should contain the connected app's client secret with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key.

   1. Note: you must create a secret for your connections in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps:

   1. When selecting a **Connection type**, select Zoom Meetings.

   1. Provide the Zoom Meetings environment you want to connect to.

   1. Select the AWS IAM role that AWS Glue can assume and that has permissions for the following actions:

------
#### [ JSON ]

****  

      ```
      {
        "Version":"2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

------

   1. Select the `secretName` that you want to use for this connection; AWS Glue stores the tokens in it.

   1. Select the network options if you want to use your own network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.

# Reading from Zoom Meetings entities


**Prerequisite**

A Zoom Meetings object you would like to read from. You will need the object name, such as `Group` or `Zoom Rooms`.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select \* | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Zoom Rooms | No | Yes | No | Yes | No | 
| Group | No | No | No | Yes | No | 
| Group Member | Yes | Yes | No | Yes | No | 
| Group Admin | No | Yes | No | Yes | No | 
| Report (daily) | Yes | No | No | Yes | No | 
| Roles | No | No | No | Yes | No | 
| Users | Yes | Yes | No | Yes | No | 

**Example**:

```
zoom_read = glueContext.create_dynamic_frame.from_options(
    connection_type="zoom",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "organization",
        "API_VERSION": "v2"
    }
)
```

**Zoom Meetings entity and field details**:

Zoom Meetings dynamically loads the available fields under the selected entity. Depending on the data type of the field, it supports the following filter operators.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/zoom-meetings-reading-from-entities.html)

## Partitioning queries


Zoom Meetings doesn't support filter-based partitioning or record-based partitioning.

# Zoom Meetings connection options


The following are connection options for Zoom Meetings:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of the Zoom Meetings entity. For example, `group`.
+ `API_VERSION`(String) - (Required) Used for Read. The Zoom Meetings REST API version you want to use. The value is `v2`, as Zoom Meetings currently supports only version v2.
+ `SELECTED_FIELDS`(List<String>) - Default: empty (SELECT \*). Used for Read. A comma-separated list of columns you want to select for the selected entity.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.

# Zoom Meetings limitations


The following are limitations or notes for Zoom Meetings:
+ Zoom Meetings does not support order by.
+ Zoom Meetings does not support filter-based partitioning because no field satisfies the required criteria.
+ Zoom Meetings does not support record-based partitioning because limit- and offset-based pagination is not supported.

# Adding a JDBC connection using your own JDBC drivers


You can use your own JDBC driver with a JDBC connection. When the default driver used by the AWS Glue crawler can't connect to a database, you can provide your own JDBC driver. For example, if you want to use SHA-256 authentication with your PostgreSQL database and older PostgreSQL drivers don't support it, you can use your own JDBC driver.

## Supported datasources



| Supported datasources | Unsupported datasources | 
| --- | --- | 
| MySQL | Snowflake | 
| Postgres |  | 
| Oracle |  | 
| Redshift |  | 
| SQL Server |  | 
| Aurora\* |  | 

 \*Supported if the native JDBC driver is being used. Not all driver features can be leveraged. 

## Adding a JDBC driver to a JDBC connection


**Note**  
 If you choose to bring your own JDBC driver versions, AWS Glue crawlers consume resources in AWS Glue jobs and Amazon S3 buckets to ensure your provided drivers run in your environment. The additional resource usage is reflected in your account. The cost for AWS Glue crawlers and jobs appears under the AWS Glue category in billing. Additionally, providing your own JDBC driver does not mean that the crawler can leverage all of the driver's features. 

**To add your own JDBC driver to a JDBC connection:**

1.  Add the JDBC driver file to an Amazon S3 location. You can create a new bucket or folder, or use an existing one. 

1.  In the AWS Glue console, choose **Connections** in the left-hand menu under **Data Catalog**, then create a new connection. 

1.  Complete the fields for **Connection properties** and choose JDBC for **Connection type**. 

1.  In **Connection access**, enter the **JDBC URL** and **JDBC Driver Class name** – *optional*. The driver class name must be for a datasource supported by AWS Glue crawlers.   
![\[The screenshot shows a data source with JDBC selected and a connection in the Add data source window.\]](http://docs.aws.amazon.com/glue/latest/dg/images/add-connection-connection-access.png)

1.  Choose the Amazon S3 path where the JDBC driver is located in the **JDBC Driver Amazon S3 Path** – *optional* field. 

1.  Complete the fields for Credential type if entering a username and password or secret. When complete, choose **Create connection**. 
**Note**  
 Testing the connection is not currently supported. When crawling the data source with a JDBC driver you provided, the crawler skips this step. 

1.  Add the newly created connection to a crawler. In the AWS Glue console, choose **Crawlers** in the left-hand menu under **Data Catalog**, then create a new crawler. 

1.  In the **Add crawler** wizard, in Step 2 choose **Add a data source**.   
![\[The screenshot shows a data source with JDBC selected and a connection in the Add data source window.\]](http://docs.aws.amazon.com/glue/latest/dg/images/add-crawler-add-data-source.png)

1.  Choose **JDBC** as the data source and choose the connection that was created in the previous steps. Complete the remaining steps in the wizard. 

1.  In order to use your own JDBC driver with an AWS Glue crawler, add the following permissions to the role used by the crawler:
   +  Grant permissions for the following job actions: `CreateJob`, `DeleteJob`, `GetJob`, `GetJobRun`, `StartJobRun`. 
   +  Grant permissions for IAM actions: `iam:PassRole` 
   +  Grant permissions for Amazon S3 actions: `s3:DeleteObjects`, `s3:GetObject`, `s3:ListBucket`, `s3:PutObject`. 
   +  Grant the service principal access to the bucket/folder in the IAM policy. 

    Example IAM policy: 

------
#### [ JSON ]

****  

   ```
   {
     "Version":"2012-10-17",
     "Statement": [
       {
         "Sid": "VisualEditor0",
         "Effect": "Allow",
         "Action": [
           "s3:PutObject",
           "s3:GetObject",
           "s3:ListBucket",
           "s3:DeleteObject"
         ],
         "Resource": [
           "arn:aws:s3:::amzn-s3-demo-bucket/driver-parent-folder/driver.jar",
           "arn:aws:s3:::amzn-s3-demo-bucket"
         ]
       }
     ]
   }
   ```

------

    The AWS Glue crawler creates two folders: `_glue_job_crawler` and `_crawler`.

   If the driver jar is located at `s3://amzn-s3-demo-bucket/driver.jar`, add the following resources: 

   ```
   "Resource": [
       "arn:aws:s3:::amzn-s3-demo-bucket/_glue_job_crawler/*",
       "arn:aws:s3:::amzn-s3-demo-bucket/_crawler/*"
   ]
   ```

   If the driver jar is located at `s3://amzn-s3-demo-bucket/tmp/driver/subfolder/driver.jar`, add the following resources: 

   ```
   "Resource": [
       "arn:aws:s3:::amzn-s3-demo-bucket/tmp/_glue_job_crawler/*",
       "arn:aws:s3:::amzn-s3-demo-bucket/tmp/_crawler/*"
   ]
   ```

1.  If you are using a VPC, you must allow access to the AWS Glue endpoint by creating an interface endpoint and adding it to your route table. For more information, see [Creating an interface VPC endpoint for AWS Glue](https://docs.aws.amazon.com/glue/latest/dg/vpc-interface-endpoints.html#vpc-endpoint-create). 

1.  If you are using encryption in your Data Catalog, create the AWS KMS interface endpoint and add it to your route table. For more information, see [Creating a VPC endpoint for AWS KMS](https://docs.aws.amazon.com/kms/latest/developerguide/kms-vpc-endpoint.html#vpce-create-endpoint). 
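The console steps above can also be performed programmatically through the AWS Glue `CreateConnection` API. The following is a minimal boto3 sketch, under the assumption that `JDBC_DRIVER_JAR_URI` and `JDBC_DRIVER_CLASS_NAME` are the connection properties corresponding to the optional console fields; the connection name, URL, bucket, and secret are placeholders:

```python
# Placeholder values for illustration; replace with your own endpoint,
# driver location, and secret.
connection_input = {
    "Name": "my-jdbc-connection",
    "ConnectionType": "JDBC",
    "ConnectionProperties": {
        "JDBC_CONNECTION_URL": "jdbc:postgresql://db.example.com:5432/mydb",
        # Bring-your-own-driver fields from the console steps above:
        "JDBC_DRIVER_JAR_URI": "s3://amzn-s3-demo-bucket/driver-parent-folder/driver.jar",
        "JDBC_DRIVER_CLASS_NAME": "org.postgresql.Driver",
        # Credentials referenced from AWS Secrets Manager:
        "SECRET_ID": "my-db-secret",
    },
}

# With AWS credentials configured, the connection is created with:
# import boto3
# boto3.client("glue").create_connection(ConnectionInput=connection_input)
```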