

# Connecting to Mixpanel
<a name="connecting-to-mixpanel"></a>

Mixpanel is a real-time analytics platform for tracking customer behavior. It lets you track how users engage with your product and analyze that data with interactive reports that query and visualize the results in a few clicks. As a Mixpanel user, you can connect AWS Glue to your Mixpanel account and then use Mixpanel as a data source in your ETL jobs. Run these jobs to transfer data from Mixpanel to AWS services or other supported applications.

**Topics**
+ [AWS Glue support for Mixpanel](Mixpanel-support.md)
+ [Policies containing the API operations for creating and using connections](mixpanel-configuring-iam-permissions.md)
+ [Configuring Mixpanel](mixpanel-configuring.md)
+ [Configuring Mixpanel connections](mixpanel-configuring-connections.md)
+ [Reading from Mixpanel entities](mixpanel-reading-from-entities.md)
+ [Mixpanel connection options](mixpanel-connection-options.md)
+ [Creating a Mixpanel account and configuring the client app](mixpanel-create-account.md)
+ [Limitations](mixpanel-connector-limitations.md)

# AWS Glue support for Mixpanel
<a name="Mixpanel-support"></a>

AWS Glue supports Mixpanel as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from Mixpanel.

**Supported as a target?**  
No.

**Supported Mixpanel API versions**  
 2.0 

# Policies containing the API operations for creating and using connections
<a name="mixpanel-configuring-iam-permissions"></a>

 The following sample policy describes the required AWS permissions for creating and using connections. If you are creating a new role, create a policy that contains the following: 

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```

Alternatively, you can use the following managed IAM policies:
+  [ AWSGlueServiceRole ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints. 
+  [ AWSGlueConsoleFullAccess ](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console. 
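
If you prefer to create the sample policy from this section programmatically, the following is a minimal sketch. It assumes you pass in a boto3 IAM client; the helper names and the policy name `GlueMixpanelConnectionPolicy` are hypothetical and not part of AWS Glue.

```python
import json


def build_connection_policy():
    """Return the sample policy from this section as a Python dict."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "glue:ListConnectionTypes",
                    "glue:DescribeConnectionType",
                    "glue:RefreshOAuth2Tokens",
                    "glue:ListEntities",
                    "glue:DescribeEntity",
                ],
                "Resource": "*",
            }
        ],
    }


def create_connection_policy(iam_client, policy_name="GlueMixpanelConnectionPolicy"):
    """Create the policy with a boto3 IAM client supplied by the caller.

    The policy name is a hypothetical placeholder; choose your own.
    """
    return iam_client.create_policy(
        PolicyName=policy_name,
        PolicyDocument=json.dumps(build_connection_policy()),
    )


# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   create_connection_policy(boto3.client("iam"))
```

Passing the client in as an argument keeps the policy document easy to inspect or test without making an AWS call.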

# Configuring Mixpanel
<a name="mixpanel-configuring"></a>

Before you can use AWS Glue to transfer data from Mixpanel, you must meet these requirements:

## Minimum requirements
<a name="mixpanel-configuring-min-requirements"></a>
+  You have a Mixpanel account. For more information about creating an account, see [Creating a Mixpanel account](mixpanel-create-account.md). 
+  Your Mixpanel account is enabled for API access. API access is enabled by default for the Enterprise, Unlimited, Developer, and Performance editions. 

If you meet these requirements, you’re ready to connect AWS Glue to your Mixpanel account. For typical connections, you don't need to do anything else in Mixpanel.

# Configuring Mixpanel connections
<a name="mixpanel-configuring-connections"></a>

Mixpanel supports Basic Authentication (`BasicAuth`) with a username and password. Basic Authentication is a simple authentication method in which clients provide credentials directly to access protected resources. AWS Glue uses the username and password to authenticate with Mixpanel APIs. 

For public Mixpanel documentation about the `BasicAuth` flow, see [Mixpanel Service Accounts](https://developer.mixpanel.com/reference/service-accounts). 
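
For background, the sketch below shows how the general HTTP Basic scheme encodes a username and password into an `Authorization` header. AWS Glue handles this step for you; the helper name `basic_auth_header` is hypothetical, and the code illustrates the scheme itself rather than the connector's implementation.

```python
import base64


def basic_auth_header(username, password):
    """Build an HTTP Basic Authorization header value.

    The credentials are joined with a colon and Base64-encoded.
    Base64 is an encoding, not encryption, so send it only over HTTPS.
    """
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


# Usage with placeholder credentials:
#   headers = {"Authorization": basic_auth_header("my-service-account", "my-secret")}
```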

To configure a Mixpanel connection:

1. In AWS Secrets Manager, create a secret with the following details: 
   +  For Basic Authentication, the secret should contain the Mixpanel service account username and secret, with `USERNAME` and `PASSWORD` as the keys. 
**Note**  
You must create a secret for each connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following these steps: 

   1. When selecting a **Connection type**, select **Mixpanel**.

   1. Provide the `INSTANCE_URL` of the Mixpanel instance that you want to connect to.

   1. Select the IAM role that AWS Glue can assume and that has permissions for the following actions: 

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

   1.  Select the `secretName` that you want to use for this connection in AWS Glue to store the credentials. 

   1.  Select **Network options** if you want to use your network. 

1.  Grant the IAM role associated with your AWS Glue job permission to read `secretName`. 
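
The secret described in step 1 can also be created from code. The following is a minimal sketch, assuming a boto3 Secrets Manager client is passed in; the helper names are hypothetical, and the `USERNAME` and `PASSWORD` keys follow the step above.

```python
import json


def build_mixpanel_secret(username, password):
    """Return the secret string with the USERNAME and PASSWORD keys."""
    return json.dumps({"USERNAME": username, "PASSWORD": password})


def create_mixpanel_secret(secrets_client, secret_name, username, password):
    """Store the credentials with a boto3 Secrets Manager client supplied by the caller."""
    return secrets_client.create_secret(
        Name=secret_name,
        SecretString=build_mixpanel_secret(username, password),
    )


# Usage (assumes boto3 is installed; all values are placeholders):
#   import boto3
#   create_mixpanel_secret(
#       boto3.client("secretsmanager"),
#       "my-mixpanel-secret",
#       "my-service-account-username",
#       "my-service-account-secret",
#   )
```

Avoid hardcoding real credentials in scripts; read them from a secure source at run time.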

# Reading from Mixpanel entities
<a name="mixpanel-reading-from-entities"></a>

 **Prerequisites** 

You must have a Mixpanel object, such as `Funnels`, `Retention`, or `Retention Funnels`, that you want to read data from. You also need to know the object name.

 **Supported entities** 


| Entity | Can be Filtered | Supports Limit | Supports Order By | Supports Select \* | Supports Partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Funnels | Yes | No | No | Yes | No | 
| Retention | Yes | No | No | Yes | No | 
| Segmentation | Yes | No | No | Yes | No | 
| Segmentation Sum | Yes | No | No | Yes | No | 
| Segmentation Average | Yes | No | No | Yes | No | 
| Cohorts | Yes | No | No | Yes | No | 
| Engage | No | Yes | No | Yes | No | 
| Events | Yes | No | No | Yes | No | 
| Events Top | Yes | No | No | Yes | No | 
| Events Names | Yes | No | No | Yes | No | 
| Events Properties | Yes | No | No | Yes | No | 
| Events Properties Top | Yes | No | No | Yes | No | 
| Events Properties Values | Yes | No | No | Yes | No | 
| Annotations | Yes | No | No | Yes | No | 
| Profile Event Activity | Yes | No | No | Yes | No | 

 **Example** 

```
# Read a Mixpanel entity into a DynamicFrame.
# Assumes glueContext is an initialized GlueContext.
mixpanel_read = glueContext.create_dynamic_frame.from_options(
    connection_type="mixpanel",
    connection_options={
        "connectionName": "connectionName",  # name of your AWS Glue connection
        "ENTITY_NAME": "/cohorts/list?project_id=2603353",
        "API_VERSION": "2.0",
        "INSTANCE_URL": "https://www.mixpanel.com/api/app/me"
    }
)
```

 **Mixpanel entity and field details** 

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/mixpanel-reading-from-entities.html)

# Mixpanel connection options
<a name="mixpanel-connection-options"></a>

The following are connection options for Mixpanel:
+  `ENTITY_NAME` (String) – (Required) Used for Read/Write. The name of your object in Mixpanel. 
+  `API_VERSION` (String) – (Required) Used for Read/Write. The Mixpanel REST API version that you want to use. For example: `2.0`. 
+  `SELECTED_FIELDS` (List<String>) – Default: empty (SELECT \*). Used for Read. The columns that you want to select for the object. 
+  `FILTER_PREDICATE` (String) – Default: empty. Used for Read. It should be in the Spark SQL format. 
+  `QUERY` (String) – Default: empty. Used for Read. A full Spark SQL query. 
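
The options above can be assembled with a small helper before they are passed to `create_dynamic_frame.from_options`. This is a sketch only: the function `build_mixpanel_options` is hypothetical, `connectionName` is taken from the read example rather than from this list, and passing `SELECTED_FIELDS` as a Python list is an assumption based on the `List<String>` type.

```python
# Keys treated as required by this sketch (connectionName comes from the read example).
REQUIRED_OPTIONS = ("connectionName", "ENTITY_NAME", "API_VERSION")


def build_mixpanel_options(connection_name, entity_name, api_version,
                           selected_fields=None, filter_predicate=None, query=None):
    """Assemble a connection_options dict, adding optional keys only when set."""
    options = {
        "connectionName": connection_name,
        "ENTITY_NAME": entity_name,
        "API_VERSION": api_version,
    }
    if selected_fields:
        options["SELECTED_FIELDS"] = list(selected_fields)
    if filter_predicate:
        options["FILTER_PREDICATE"] = filter_predicate
    if query:
        options["QUERY"] = query

    # Fail early if a required option is empty or missing.
    missing = [key for key in REQUIRED_OPTIONS if not options.get(key)]
    if missing:
        raise ValueError(f"Missing required connection options: {missing}")
    return options


# Usage inside a Glue job (glueContext is an initialized GlueContext;
# the connection name is a placeholder):
#   frame = glueContext.create_dynamic_frame.from_options(
#       connection_type="mixpanel",
#       connection_options=build_mixpanel_options(
#           "my-mixpanel-connection", "/cohorts/list?project_id=2603353", "2.0"
#       ),
#   )
```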

# Creating a Mixpanel account and configuring the client app
<a name="mixpanel-create-account"></a>

**Creating a Mixpanel account**

1. Navigate to the [Mixpanel home page](https://mixpanel.com/). 

1. On the **Mixpanel** home page, choose **Sign Up** at the upper-right corner of the page. 

1. On the **Let's get started** page, complete the following actions: 
   + Enter your email address in the designated field.
   + Select the required checkbox to agree to the terms.
   + Choose **Get Started** to proceed.

     Upon successful completion, you will receive a verification email. 

1. Check your email inbox for a verification message, open the email, and follow the instructions to verify your email address. 

1. On the verification page, choose **Verify Email** to complete your email verification. 

1. On the **Name Your Organization** page, enter your organization name and choose **Next**. 

1. On the **Your First Project** page, enter your project details and choose **Create**.

1. On the next page, choose **Let's get Started** to complete the creation of your account. 

**Logging into a Mixpanel account**

1. Navigate to the [Mixpanel login page](https://mixpanel.com/login/). 

1. Enter your email address and choose **Continue**. 

1. Check your email inbox for a verification message, open the email, and follow the instructions to verify your email address. 

1. On the next page, choose **Log In** to log in to your account. 

**Purchasing a Mixpanel plan**

1. On the Mixpanel page, select the **Settings** icon located in the upper-right corner of the page.

1. From the list of options, select **Plan Details and Billing**. 

1. On the **Plan Details and Billing** page, select **Upgrade or Modify**.

1. On the next page, select the plan that you want to purchase.

   This completes the account creation and plan purchasing process.

**Creating a username and client secret (to register your app)**

1. On the Mixpanel page, select the **Settings** icon located in the upper-right corner of the page. 

1. From the list of options, select **Project Settings**. 

1. On the **Project Settings** page, select **Service Accounts** and then select **Add Service Account**.

1. From the **Service Account** dropdown list, select a service account or enter a name to create one, add a **Project Role**, specify an **Expires** value, and select **Add**. 
**Important**  
After you complete the previous step, the next page displays the service account's secret key. Make sure to save the secret key. You will not be able to access it again after this point.

# Limitations
<a name="mixpanel-connector-limitations"></a>

The following are limitations for the Mixpanel connector:
+ For the `Segmentation Numeric` entity, the Mixpanel API returns a `400 (Bad Request)` error if no numeric data is found for the mandatory filters. The connector treats this as an `OK` response to prevent the flow from failing.
+ The queryable field `limit` has been removed from the supported entities because:
  + It was causing errors due to being interpreted as the SDK's limit feature
  + The filter served no practical purpose
  + Equivalent functionality is now covered by the limit feature implementation
+ Field-based partitioning is not supported because Mixpanel does not provide the full set of operators required for partitioning (`>=`, `<=`, `<`, `>`, `between`). Although Mixpanel supports the `between` operator, the fields that support it are non-retrievable, so the criteria for field-based partitioning are not satisfied.
+  Record-based partitioning is not supported because there is no way to get an `offset` value for entities that support pagination.
+ The `Cohorts` entity only supports the `CreatedDate/Time` field, and there is no field to identify `UpdatedDate/Time`. As a result, `DML_Status` cannot be identified. There is also no endpoint to identify deleted records. Therefore, change data capture (CDC) is not supported.
+  To run an AWS Glue job for the entities listed in the following table, mandatory filters are required. Refer to the table for entity names and their required filters.  
**Entity name and required filters**    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/mixpanel-connector-limitations.html)