

# Connecting to PayPal
<a name="connecting-to-data-paypal"></a>

PayPal is a payments system that facilitates online money transfers between parties, such as transfers between customers and online vendors. If you're a PayPal user, your account contains data about your transactions, such as their payers, dates, and statuses. You can use AWS Glue to transfer data from PayPal to certain AWS services or other supported applications.

**Topics**
+ [AWS Glue support for PayPal](paypal-support.md)
+ [Policies containing the API operations for creating and using connections](paypal-configuring-iam-permissions.md)
+ [Configuring PayPal](paypal-configuring.md)
+ [Configuring PayPal connections](paypal-configuring-connections.md)
+ [Reading from PayPal entities](paypal-reading-from-entities.md)
+ [PayPal connection options](paypal-connection-options.md)
+ [Limitations and notes for PayPal connector](paypal-connector-limitations.md)

# AWS Glue support for PayPal
<a name="paypal-support"></a>

AWS Glue supports PayPal as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from PayPal.

**Supported as a target?**  
No.

**Supported PayPal API versions**  
The following PayPal API versions are supported:
+ v1

# Policies containing the API operations for creating and using connections
<a name="paypal-configuring-iam-permissions"></a>

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```


Alternatively, you can use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring PayPal
<a name="paypal-configuring"></a>

Before you can use AWS Glue to transfer data from PayPal, you must meet these requirements:

## Minimum requirements
<a name="paypal-configuring-min-requirements"></a>

The following are minimum requirements:
+ You have a PayPal account with client credentials.
+ Your PayPal account has API access with a valid license.

If you meet these requirements, you’re ready to connect AWS Glue to your PayPal account. For typical connections, you don't need to do anything else in PayPal.

# Configuring PayPal connections
<a name="paypal-configuring-connections"></a>

PayPal supports the CLIENT CREDENTIALS grant type for OAuth 2.0.
+ This grant type is considered two-legged OAuth 2.0 because clients use it to obtain an access token outside the context of a user. AWS Glue uses the client ID and client secret to authenticate with the PayPal APIs, which are provided by custom services that you define.
+ Each custom service is owned by an API-only user that has a set of roles and permissions, which authorize the service to perform specific actions. An access token is associated with a single custom service.
+ This grant type results in a short-lived access token, which can be renewed by calling the `/v1/oauth2/token` endpoint again.
+ For public PayPal documentation for OAuth 2.0 with client credentials, see [Authentication](https://developer.paypal.com/api/rest/authentication/).

To configure a PayPal connection:

1. In AWS Secrets Manager, create a secret with the following details:

   1. For the customer managed connected app, the secret must contain the connected app's consumer secret, with `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET` as the key.

   1. Note: you must create a secret for your connections in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select PayPal.

   1. Provide the `INSTANCE_URL` of the PayPal instance you want to connect to.

   1. Select the AWS IAM role that AWS Glue can assume and that has permissions for the following actions:

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```


   1. Select the `secretName` that you want AWS Glue to use to store the tokens for this connection.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.
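
As a rough sketch of step 1, the secret value can be assembled as a JSON object keyed by `USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET`. The helper name `build_paypal_secret_string`, the placeholder client secret, and the secret name `my-paypal-secret` are hypothetical. Storing the value is shown only as a comment because it requires AWS credentials.

```python
import json


def build_paypal_secret_string(client_secret: str) -> str:
    """Return the JSON string to store as the Secrets Manager secret value."""
    if not client_secret:
        raise ValueError("client_secret must not be empty")
    return json.dumps(
        {"USER_MANAGED_CLIENT_APPLICATION_CLIENT_SECRET": client_secret}
    )


# Placeholder value; use the client secret from your PayPal app.
secret_string = build_paypal_secret_string("example-client-secret")

# Storing the secret requires AWS credentials, for example with boto3:
#   import boto3
#   boto3.client("secretsmanager").create_secret(
#       Name="my-paypal-secret",  # hypothetical secret name
#       SecretString=secret_string,
#   )
```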

## Getting OAuth 2.0 credentials
<a name="paypal-getting-oauth-20-credentials"></a>

To call the REST API, you need to exchange your client ID and client secret for an access token. For more information, see [Get started with PayPal REST APIs](https://developer.paypal.com/api/rest/).
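
The exchange can be sketched with the Python standard library as follows. This is a minimal illustration, not the connector's implementation: the function name `build_token_request` and the placeholder credentials are hypothetical, and the token path `/v1/oauth2/token` with HTTP Basic authentication is assumed from PayPal's public REST authentication guide. The request is built but not sent, so the sketch runs without network access.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(client_id, client_secret,
                        base_url="https://api-m.paypal.com",
                        token_path="/v1/oauth2/token"):
    """Build (but do not send) a client-credentials token request."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    headers = {
        # Client ID and secret are sent as HTTP Basic credentials.
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(
        base_url + token_path, data=body, headers=headers, method="POST"
    )


# Placeholder credentials; replace with the values from your PayPal app.
request = build_token_request("example-client-id", "example-client-secret")

# Sending the request needs network access and valid credentials:
#   import json
#   with urllib.request.urlopen(request) as response:
#       access_token = json.load(response)["access_token"]
```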

# Reading from PayPal entities
<a name="paypal-reading-from-entities"></a>

**Prerequisite**

A PayPal object you would like to read from. You will need the object name, `transaction`.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select \* | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| transaction | Yes | Yes | No | Yes | Yes | 

**Example**:

```
# Read the transaction entity through an existing AWS Glue connection.
paypal_read = glueContext.create_dynamic_frame.from_options(
    connection_type="paypal",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "transaction",
        "API_VERSION": "v1",
        "INSTANCE_URL": "https://api-m.paypal.com"
    }
)
```

**PayPal entity and field details**:

Entities with static metadata:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/paypal-reading-from-entities.html)

## Partitioning queries
<a name="paypal-reading-partitioning-queries"></a>

You can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS` if you want to use concurrency in Spark. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For a DateTime field, the value must be in ISO format.

  Example of a valid value:

  ```
  "2024-07-01T00:00:00.000Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.

The following field is supported for entity-wise partitioning:


| Entity name | Partitioning fields | Data type | 
| --- | --- | --- | 
| transaction | transaction\_initiation\_date | DateTime | 

Example:

```
# Read one day of transactions, split into 10 partitions.
paypal_read = glueContext.create_dynamic_frame.from_options(
    connection_type="paypal",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "transaction",
        "API_VERSION": "v1",
        "PARTITION_FIELD": "transaction_initiation_date",
        "LOWER_BOUND": "2024-07-01T00:00:00.000Z",
        "UPPER_BOUND": "2024-07-02T00:00:00.000Z",
        "NUM_PARTITIONS": "10"
    }
)
```
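
To illustrate how a range might be divided, the following sketch splits the interval from `LOWER_BOUND` (inclusive) to `UPPER_BOUND` (exclusive) into `NUM_PARTITIONS` equal sub-ranges. The even split and the helper names `parse_iso` and `split_range` are assumptions for illustration only; the connector's actual splitting logic is not described here.

```python
from datetime import datetime, timezone


def parse_iso(value):
    """Parse an ISO timestamp such as 2024-07-01T00:00:00.000Z as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def split_range(lower_bound, upper_bound, num_partitions):
    """Return (start, end) pairs; start is inclusive and end is exclusive."""
    start, end = parse_iso(lower_bound), parse_iso(upper_bound)
    if num_partitions < 1 or start >= end:
        raise ValueError("need num_partitions >= 1 and lower_bound < upper_bound")
    step = (end - start) / num_partitions
    # Use the exact upper bound as the last edge to avoid rounding drift.
    edges = [start + step * i for i in range(num_partitions)] + [end]
    return list(zip(edges[:-1], edges[1:]))


partitions = split_range(
    "2024-07-01T00:00:00.000Z", "2024-07-02T00:00:00.000Z", 10
)
for sub_start, sub_end in partitions:
    print(sub_start.isoformat(), "to", sub_end.isoformat())
```

Under this assumption, the one-day range in the example above yields ten sub-ranges of 2 hours 24 minutes each.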

# PayPal connection options
<a name="paypal-connection-options"></a>

The following are connection options for PayPal:
+ `ENTITY_NAME`(String) - (Required) Used for Read. The name of your object in PayPal.
+ `API_VERSION`(String) - (Required) Used for Read. The PayPal REST API version you want to use.
+ `SELECTED_FIELDS`(List<String>) - Default: empty(SELECT \*). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE`(String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY`(String) - Default: empty. Used for Read. Full Spark SQL query.
+ `PARTITION_FIELD`(String) - Used for Read. Field to be used to partition query.
+ `LOWER_BOUND`(String) - Used for Read. An inclusive lower bound value of the chosen partition field.
+ `UPPER_BOUND`(String) - Used for Read. An exclusive upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`(Integer) - Default: 1. Used for Read. Number of partitions for read.
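
As a small illustration of how these options fit together, the following sketch checks an options dictionary before it is passed to AWS Glue. The helper `check_connection_options` is hypothetical and not part of AWS Glue. The rule that `PARTITION_FIELD`, `LOWER_BOUND`, and `UPPER_BOUND` must be supplied together is an assumption based on the partitioning description, not a documented validation rule.

```python
REQUIRED_OPTIONS = ("ENTITY_NAME", "API_VERSION")
# Assumption: these three only make sense when supplied together.
BOUND_OPTIONS = ("PARTITION_FIELD", "LOWER_BOUND", "UPPER_BOUND")


def check_connection_options(options):
    """Return a list of problems found; an empty list means none were found."""
    problems = [
        f"missing required option {name}"
        for name in REQUIRED_OPTIONS
        if not options.get(name)
    ]
    supplied = [name for name in BOUND_OPTIONS if name in options]
    if supplied and len(supplied) != len(BOUND_OPTIONS):
        missing = sorted(set(BOUND_OPTIONS) - set(supplied))
        problems.append("partitioning also needs " + ", ".join(missing))
    return problems


options = {
    "ENTITY_NAME": "transaction",
    "API_VERSION": "v1",
    "PARTITION_FIELD": "transaction_initiation_date",
}
for problem in check_connection_options(options):
    print(problem)
```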

# Limitations and notes for PayPal connector
<a name="paypal-connector-limitations"></a>

The following are limitations or notes for the PayPal connector:
+ The [PayPal transactions documentation](https://developer.paypal.com/docs/api/transaction-search/v1/#search_get) states that it takes a maximum of three hours for executed transactions to appear in the list transactions call. However, it has been observed to take longer, depending on the [`last_refreshed_datetime`](https://developer.paypal.com/docs/api/transaction-search/v1/#search_get:~:text=last_refreshed_datetime). Here, `last_refreshed_datetime` is the point in time up to which data is available from the APIs.
+ If `last_refreshed_datetime` is earlier than the requested `end_date`, the `end_date` is set to `last_refreshed_datetime`, because data is available only up to that point.
+ The `transaction_initiation_date` field is a mandatory filter for the `transaction` entity, and the [maximum supported](https://developer.paypal.com/docs/transaction-search/#:~:text=The%20maximum%20supported%20date%20range%20is%2031%20days.) date range for this field is 31 days.
+ When you call the `transaction` entity API with filters (query parameters) other than the `transaction_initiation_date` field, the value of the [`ending_balance`](https://developer.paypal.com/docs/api/transaction-search/v1/#search_get:~:text=If%20you%20specify%20one%20or%20more%20optional%20query%20parameters%2C%20the%20ending_balance%20response%20field%20is%20empty.) field is not expected to be returned in the response.
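
One way to work within the 31-day limit and the `last_refreshed_datetime` cutoff is to plan the date windows before reading. The sketch below is an illustration under stated assumptions: the helper `plan_windows` is hypothetical, the `last_refreshed` value is a placeholder you would obtain from a PayPal API response, and each resulting window is assumed to be used as one `transaction_initiation_date` range.

```python
from datetime import datetime, timedelta, timezone

MAX_WINDOW = timedelta(days=31)  # maximum supported date range noted above


def plan_windows(start, end, last_refreshed):
    """Split start..end into windows of at most 31 days, capped at last_refreshed."""
    end = min(end, last_refreshed)  # no data is available past last_refreshed
    windows = []
    while start < end:
        window_end = min(start + MAX_WINDOW, end)
        windows.append((start, window_end))
        start = window_end
    return windows


utc = timezone.utc
windows = plan_windows(
    datetime(2024, 7, 1, tzinfo=utc),
    datetime(2024, 9, 1, tzinfo=utc),
    last_refreshed=datetime(2024, 8, 15, tzinfo=utc),  # placeholder value
)
for window_start, window_end in windows:
    print(window_start.isoformat(), "to", window_end.isoformat())
```

With these placeholder dates, the requested two-month range is reduced to two windows: July 1 to August 1, and August 1 to August 15.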