

# Connecting to Pendo
<a name="connecting-to-pendo"></a>

Pendo provides a rich data store for user interaction data. You can transfer this data to AWS to join it with other product data, perform additional analysis, build dashboards, and set alerts.

**Topics**
+ [AWS Glue support for Pendo](pendo-support.md)
+ [Policies containing the API operations for creating and using connections](pendo-configuring-iam-permissions.md)
+ [Configuring Pendo](pendo-configuring.md)
+ [Configuring Pendo connections](pendo-configuring-connections.md)
+ [Reading from Pendo entities](pendo-reading-from-entities.md)
+ [Pendo connection options](pendo-connection-options.md)
+ [Limitations](pendo-connector-limitations.md)

# AWS Glue support for Pendo
<a name="pendo-support"></a>

AWS Glue supports Pendo as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Pendo.

**Supported as a target?**  
No.

**Supported Pendo API versions**  
 v1 

# Policies containing the API operations for creating and using connections
<a name="pendo-configuring-iam-permissions"></a>

The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```
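
If you prefer to generate the policy document from code rather than paste the JSON, it can be assembled with the Python standard library. The following is a minimal sketch: the helper name `build_glue_connection_policy` is hypothetical, and only the action names are taken from the sample policy.

```python
import json

# Actions copied from the sample policy for creating and using connections.
GLUE_CONNECTION_ACTIONS = [
    "glue:ListConnectionTypes",
    "glue:DescribeConnectionType",
    "glue:RefreshOAuth2Tokens",
    "glue:ListEntities",
    "glue:DescribeEntity",
]


def build_glue_connection_policy(actions=GLUE_CONNECTION_ACTIONS, resource="*"):
    """Return an IAM policy document (as a dict) that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": list(actions), "Resource": resource}
        ],
    }


# Serialize the document so it can be supplied wherever a policy JSON is expected.
policy_json = json.dumps(build_glue_connection_policy(), indent=2)
print(policy_json)
```

The resulting string can be supplied as the policy document when you create the policy in the IAM console, with the AWS CLI, or through an SDK.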


Alternatively, you can use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 

# Configuring Pendo
<a name="pendo-configuring"></a>

Before you can use AWS Glue to transfer data from Pendo, you must meet the following requirements:

## Minimum requirements
<a name="pendo-configuring-min-requirements"></a>
+ You have a Pendo account with an `apiKey` with `write access` enabled.
+ Your Pendo account has API access with a valid license. 

If you meet these requirements, you’re ready to connect AWS Glue to your Pendo account. For typical connections, you don't need to do anything else in Pendo.

# Configuring Pendo connections
<a name="pendo-configuring-connections"></a>

Pendo supports custom authentication.

For public Pendo documentation on generating the API keys required for custom authentication, see [Authentication – Pendo REST API Documentation](https://engageapi.pendo.io/?bash#getting-started).

To configure a Pendo connection:

1. In AWS Secrets Manager, create a secret with the following details: 
   + For a customer managed connected app, the secret should contain the connected app consumer secret with `apiKey` as the key. 
**Note**  
You must create a separate secret for each connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Data Source**, select Pendo.

   1. Provide the `instanceUrl` of the Pendo instance you want to connect to.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions: 

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```


   1. Select the `secretName` that you want to use for this connection in AWS Glue to store the tokens. 

   1. Select the network options if you want to use your network. 

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 

1. In your AWS Glue job configuration, provide `connectionName` as an Additional network connection.
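
The secret created in step 1 is a JSON document that stores the Pendo key under `apiKey`. The following sketch shows one way to build that payload. The helper `build_pendo_secret_string`, the placeholder key value, and the secret name in the comment are hypothetical, and the commented-out call assumes you use the AWS SDK for Python (Boto3) to create the secret.

```python
import json


def build_pendo_secret_string(api_key: str) -> str:
    """Return the JSON secret string that stores the Pendo key under `apiKey`."""
    if not api_key or not api_key.strip():
        raise ValueError("api_key must be a non-empty string")
    return json.dumps({"apiKey": api_key.strip()})


# Placeholder value for illustration only; never hard-code a real key.
secret_string = build_pendo_secret_string("example-pendo-api-key")
print(secret_string)

# With Boto3 installed and credentials configured, the secret could then be
# created like this (the secret name is a hypothetical example):
#   import boto3
#   boto3.client("secretsmanager").create_secret(
#       Name="pendo-connection-secret", SecretString=secret_string
#   )
```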

# Reading from Pendo entities
<a name="pendo-reading-from-entities"></a>

 **Prerequisites** 

A Pendo object that you would like to read from. Refer to the supported entities table below to check the available entities. 

 **Supported entities** 
+ [Feature](https://developers.pendo.io/docs/?bash#feature)
+ [Guide](https://developers.pendo.io/docs/?bash#guide)
+ [Page](https://developers.pendo.io/docs/?bash#page)
+ [Report](https://developers.pendo.io/docs/?bash#report)
+ [Report Data](https://developers.pendo.io/docs/?bash#return-report-contents-as-array-of-json-objects)
+ [Visitor](https://developers.pendo.io/docs/?bash#visitor)
+ [Account](https://developers.pendo.io/docs/?bash#entities)
+ [Event](https://developers.pendo.io/docs/?bash#events-grouped)
+ [Feature Event](https://developers.pendo.io/docs/?bash#events-grouped)
+ [Guide Event](https://developers.pendo.io/docs/?bash#events-ungrouped)
+ [Page Event](https://developers.pendo.io/docs/?bash#events-grouped)
+ [Poll Event](https://developers.pendo.io/docs/?bash#events-ungrouped)
+ [Track Event](https://developers.pendo.io/docs/?bash#events-grouped)


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select \* | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Feature | No | No | No | Yes | No | 
| Guide | No | No | No | Yes | No | 
| Page | No | No | No | Yes | No | 
| Report | No | No | No | Yes | No | 
| Report Data | No | No | No | Yes | No | 
| Visitor (Aggregation API) | Yes | No | Yes | Yes | No | 
| Account (Aggregation API) | Yes | No | Yes | Yes | No | 
| Event (Aggregation API) | Yes | No | Yes | Yes | No | 
| Feature Event (Aggregation API) | Yes | No | Yes | Yes | Yes | 
| Guide Event (Aggregation API) | Yes | No | Yes | Yes | Yes | 
| Page Event (Aggregation API) | Yes | No | Yes | Yes | Yes | 
| Poll Event (Aggregation API) | Yes | No | Yes | Yes | Yes | 
| Track Event (Aggregation API) | Yes | No | Yes | Yes | Yes | 

 **Example** 

```
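from awsglue.context import GlueContext
from pyspark.context import SparkContext

glueContext = GlueContext(SparkContext.getOrCreate())

# Read the Feature entity through an existing AWS Glue connection.
pendo_read = glueContext.create_dynamic_frame.from_options(
    connection_type="glue.spark.Pendo",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "feature",
        "API_VERSION": "v1",
        "INSTANCE_URL": "instanceUrl"
    }
)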
```

## Partitioning queries
<a name="adobe-marketo-engage-reading-partitioning-queries"></a>

You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to use concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For a DateTime field, the value must be in ISO format.

  Example of a valid value:

  ```
  "2024-07-01T00:00:00.000Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.
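
To make the bound semantics concrete, the following sketch splits an inclusive lower bound and an exclusive upper bound into `NUM_PARTITIONS` contiguous sub-ranges for an integer field. It illustrates even range splitting in general and is not the connector's actual implementation, which is not described here. `split_range` is a hypothetical helper.

```python
def split_range(lower_bound: int, upper_bound: int, num_partitions: int):
    """Split [lower_bound, upper_bound) into contiguous half-open sub-ranges."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if upper_bound <= lower_bound:
        raise ValueError("upper_bound must be greater than lower_bound")
    span = upper_bound - lower_bound
    # Integer arithmetic keeps the edges exact and the ranges gap-free.
    edges = [
        lower_bound + (span * i) // num_partitions
        for i in range(num_partitions + 1)
    ]
    return list(zip(edges[:-1], edges[1:]))


# Each sub-range includes its start value and excludes its end value.
for start, end in split_range(4656, 7788, 10):
    print(f"{start} <= field < {end}")
```

Because each sub-range is half-open, every value in the overall range falls into exactly one sub-query, which is why the lower bound is inclusive and the upper bound is exclusive.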

The following table lists the entities that support partitioning:


| Entity name | 
| --- | 
| Event | 
| Feature Event | 
| Guide Event | 
| Page Event | 
| Poll Event | 
| Track Event | 

Example:

```
# Assumes glueContext is an initialized GlueContext.
# Split the read into 10 partitions on appId over [4656, 7788).
pendo_read = glueContext.create_dynamic_frame.from_options(
    connection_type="glue.spark.pendo",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "event",
        "API_VERSION": "v1",
        "INSTANCE_URL": "instanceUrl",
        "NUM_PARTITIONS": "10",
        "PARTITION_FIELD": "appId",
        "LOWER_BOUND": "4656",
        "UPPER_BOUND": "7788"
    }
)
```

# Pendo connection options
<a name="pendo-connection-options"></a>

The following are connection options for Pendo:
+ `ENTITY_NAME` (String) - (Required) Used for Read. The name of your object in Pendo. 
+ `INSTANCE_URL` (String) - (Required) A valid Pendo instance URL. Allowed values:
  + [Default](https://app.pendo.io/)
  + [Europe](https://app.eu.pendo.io/)
  + [US1](https://us1.app.pendo.io/)
+ `API_VERSION` (String) - (Required) Used for Read. The Pendo Engage REST API version you want to use. For example: v1.
+ `SELECTED_FIELDS` (List<String>) - Default: empty (SELECT \*). Used for Read. The columns you want to select for the object.
+ `FILTER_PREDICATE` (String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY` (String) - Default: empty. Used for Read. A full Spark SQL query.
+ `PARTITION_FIELD` (String) - Used for Read. The field to be used to partition the query.
+ `LOWER_BOUND` (String) - Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND` (String) - Used for Read. An exclusive upper bound value of the chosen partition field. 
+ `NUM_PARTITIONS` (Integer) - Default: 1. Used for Read. The number of partitions for the read.
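
These options can be checked before a job runs. The following sketch validates a `connection_options` dictionary against the options marked as required and the allowed `INSTANCE_URL` values. `validate_pendo_options` is a hypothetical helper and not part of AWS Glue. The sketch assumes that `INSTANCE_URL` takes the URL itself as its value, and the connection name is a placeholder.

```python
# Options marked (Required) in the connection options list.
REQUIRED_OPTIONS = ("ENTITY_NAME", "INSTANCE_URL", "API_VERSION")

# Assumption: INSTANCE_URL takes the URL itself (not the label) as its value.
ALLOWED_INSTANCE_URLS = {
    "https://app.pendo.io",
    "https://app.eu.pendo.io",
    "https://us1.app.pendo.io",
}


def validate_pendo_options(options: dict) -> list:
    """Return a list of problems found in the options; empty means none found."""
    problems = [
        f"missing required option: {name}"
        for name in REQUIRED_OPTIONS
        if not options.get(name)
    ]
    # Ignore a trailing slash when comparing against the allowed URLs.
    url = str(options.get("INSTANCE_URL", "")).rstrip("/")
    if url and url not in ALLOWED_INSTANCE_URLS:
        problems.append(f"unexpected INSTANCE_URL: {url}")
    return problems


options = {
    "connectionName": "my-pendo-connection",  # placeholder name
    "ENTITY_NAME": "feature",
    "API_VERSION": "v1",
    "INSTANCE_URL": "https://app.pendo.io/",
}
print(validate_pendo_options(options))
```

Returning a list of problems instead of raising on the first one lets a job report every misconfigured option in a single pass.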

# Limitations
<a name="pendo-connector-limitations"></a>

The following are limitations for the Pendo connector:
+ Pagination is not supported in Pendo.
+ Filtering is supported only for Aggregation API objects (`Account`, `Event`, `Feature Event`, `Guide Event`, `Page Event`, `Poll Event`, `Track Event`, and `Visitor`).
+ `DateTimeRange` is a mandatory filter parameter for the following Aggregation API objects: `Event`, `Feature Event`, `Guide Event`, `Page Event`, `Poll Event`, and `Track Event`.
+ The `dayRange` period is rounded down to the start of the period in the time zone. For example, if the provided filter is `2023-01-12T07:55:27.065Z`, the time period is rounded to the start of the period, that is, `2023-01-12T00:00:00Z`.
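
The rounding in the last limitation can be reproduced locally to predict which period a filter value falls into. The following sketch assumes that the period is one day and that the time zone is UTC, as in the example. `round_down_to_day` is a hypothetical helper, not a connector API.

```python
from datetime import datetime, timezone


def round_down_to_day(timestamp: str) -> str:
    """Round an ISO 8601 UTC timestamp down to 00:00:00 of the same day."""
    # Replace the trailing "Z" so fromisoformat accepts it on older Pythons.
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    start = parsed.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return start.strftime("%Y-%m-%dT%H:%M:%SZ")


print(round_down_to_day("2023-01-12T07:55:27.065Z"))  # prints 2023-01-12T00:00:00Z
```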