

# Connecting to WooCommerce
<a name="connecting-to-data-woocommerce"></a>

WooCommerce is a flexible, open-source e-commerce plugin built for WordPress-based websites. It's commonly used to turn a regular website into a fully functioning online store.

**Topics**
+ [AWS Glue support for WooCommerce](woocommerce-support.md)
+ [Policies containing the API operations for creating and using connections](woocommerce-configuring-iam-permissions.md)
+ [Configuring WooCommerce](woocommerce-configuring.md)
+ [Configuring WooCommerce connections](woocommerce-configuring-connections.md)
+ [Reading from WooCommerce entities](woocommerce-reading-from-entities.md)
+ [WooCommerce connection options](woocommerce-connection-options.md)

# AWS Glue support for WooCommerce
<a name="woocommerce-support"></a>

AWS Glue supports WooCommerce as follows:

**Supported as a source?**  
Yes. You can use AWS Glue ETL jobs to query data from WooCommerce.

**Supported as a target?**  
No.

**Supported WooCommerce API versions**  
The following WooCommerce API versions are supported:
+ v3

# Policies containing the API operations for creating and using connections
<a name="woocommerce-configuring-iam-permissions"></a>

The following sample policy describes the required AWS IAM permissions for creating and using connections. If you are creating a new role, create a policy that contains the following:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:ListConnectionTypes",
        "glue:DescribeConnectionType",
        "glue:RefreshOAuth2Tokens",
        "glue:ListEntities",
        "glue:DescribeEntity"
      ],
      "Resource": "*"
    }
  ]
}
```
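
If you manage IAM from a script, the sample policy above could be assembled as follows. This is a minimal sketch, not an official procedure: the function names `build_connection_policy` and `create_connection_policy` are hypothetical, and `iam_client` is assumed to be a boto3 IAM client that the caller has already created.

```python
import json

# Actions copied from the sample policy above.
GLUE_CONNECTION_ACTIONS = [
    "glue:ListConnectionTypes",
    "glue:DescribeConnectionType",
    "glue:RefreshOAuth2Tokens",
    "glue:ListEntities",
    "glue:DescribeEntity",
]


def build_connection_policy(resource="*"):
    """Return the sample policy document as a Python dict."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(GLUE_CONNECTION_ACTIONS),
                "Resource": resource,
            }
        ],
    }


def create_connection_policy(iam_client, policy_name):
    """Create a customer managed policy from the document above.

    Assumption: iam_client is a boto3 IAM client, for example
    boto3.client("iam"), created by the caller.
    """
    return iam_client.create_policy(
        PolicyName=policy_name,
        PolicyDocument=json.dumps(build_connection_policy()),
    )
```

Keeping the document builder separate from the API call lets you review or print the JSON before anything is created in your account.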

Alternatively, you can use the following managed IAM policies:
+ [AWSGlueServiceRole](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole) – Grants access to resources that various AWS Glue processes require to run on your behalf. These resources include AWS Glue, Amazon S3, IAM, CloudWatch Logs, and Amazon EC2. If you follow the naming convention for resources specified in this policy, AWS Glue processes have the required permissions. This policy is typically attached to roles specified when defining crawlers, jobs, and development endpoints.
+ [AWSGlueConsoleFullAccess](https://console.aws.amazon.com/iam/home#policies/arn:aws:iam::aws:policy/AWSGlueConsoleFullAccess) – Grants full access to AWS Glue resources when an identity that the policy is attached to uses the AWS Management Console. If you follow the naming convention for resources specified in this policy, users have full console capabilities. This policy is typically attached to users of the AWS Glue console.

# Configuring WooCommerce
<a name="woocommerce-configuring"></a>

Before you can use AWS Glue to transfer data from WooCommerce, you must meet these requirements:

## Minimum requirements
<a name="woocommerce-configuring-min-requirements"></a>

The following are minimum requirements:
+ You have a WooCommerce account with a `consumerKey` and a `consumerSecret`.
+ Your WooCommerce account has API access with a valid license.

If you meet these requirements, you’re ready to connect AWS Glue to your WooCommerce account. For typical connections, you don't need to do anything else in WooCommerce.

# Configuring WooCommerce connections
<a name="woocommerce-configuring-connections"></a>

WooCommerce supports custom authentication. For public WooCommerce documentation on generating the required API keys for custom authorization, see [Authentication – WooCommerce REST API Documentation](https://woocommerce.github.io/woocommerce-rest-api-docs/#authentication).
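
Before you store the keys, it can help to confirm that they are accepted by your store. The sketch below builds a request that authenticates with HTTP Basic authentication over HTTPS, using the consumer key as the user name and the consumer secret as the password, as described in the public WooCommerce REST API documentation. The function name `build_woocommerce_request` and the sample values are hypothetical, and the `/wp-json/wc/v3/` path comes from the WooCommerce documentation rather than from this guide.

```python
import base64
import urllib.request


def build_woocommerce_request(instance_url, consumer_key, consumer_secret,
                              resource="coupons", api_version="v3"):
    """Build a GET request for a WooCommerce REST API resource.

    Assumption: the store is served over HTTPS, so Basic authentication
    with the consumer key and consumer secret is accepted.
    """
    url = f"{instance_url.rstrip('/')}/wp-json/wc/{api_version}/{resource}"
    credentials = f"{consumer_key}:{consumer_secret}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
```

To send the request, pass the returned object to `urllib.request.urlopen`. An HTTP 401 response usually means that the keys are wrong or lack read permission.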

To configure a WooCommerce connection:

1. In AWS Secrets Manager, create a secret with the following details:
   + For a customer managed connected app, the secret should contain the connected app's consumer credentials, with `consumerKey` and `consumerSecret` as keys. Note: You must create one secret per connection in AWS Glue.

1. In AWS Glue Studio, create a connection under **Data Connections** by following the steps below:

   1. When selecting a **Connection type**, select WooCommerce.

   1. Provide the `INSTANCE_URL` of the WooCommerce instance you want to connect to.

   1. Select the AWS IAM role that AWS Glue can assume and that has permissions for the following actions:

      ```json
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "secretsmanager:DescribeSecret",
              "secretsmanager:GetSecretValue",
              "secretsmanager:PutSecretValue",
              "ec2:CreateNetworkInterface",
              "ec2:DescribeNetworkInterfaces",
              "ec2:DeleteNetworkInterface"
            ],
            "Resource": "*"
          }
        ]
      }
      ```

   1. Select the `secretName` that you want AWS Glue to use for this connection to store the tokens.

   1. Select the network options if you want to use your network.

1. Grant the IAM role associated with your AWS Glue job permission to read `secretName`.
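
The secret from the first step can also be created programmatically. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and `secrets_client` is assumed to be a boto3 Secrets Manager client created by the caller. Only the key names `consumerKey` and `consumerSecret` come from this guide.

```python
import json


def build_secret_string(consumer_key, consumer_secret):
    """Return the JSON secret value with the two keys that the connection expects."""
    if not consumer_key or not consumer_secret:
        raise ValueError("Both consumerKey and consumerSecret are required")
    return json.dumps(
        {"consumerKey": consumer_key, "consumerSecret": consumer_secret}
    )


def create_woocommerce_secret(secrets_client, secret_name,
                              consumer_key, consumer_secret):
    """Create the secret in AWS Secrets Manager.

    Assumption: secrets_client is a boto3 client, for example
    boto3.client("secretsmanager"), created by the caller.
    """
    return secrets_client.create_secret(
        Name=secret_name,
        SecretString=build_secret_string(consumer_key, consumer_secret),
    )
```

Because each connection needs its own secret, calling `create_woocommerce_secret` once per connection with a distinct `secret_name` keeps the mapping explicit.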

# Reading from WooCommerce entities
<a name="woocommerce-reading-from-entities"></a>

**Prerequisite**

A WooCommerce object that you want to read from. You need the object name, such as `coupon`, `order`, or `product`.

**Supported entities for source**:


| Entity | Can be filtered | Supports limit | Supports Order by | Supports Select \* | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | 
| Coupon | Yes | Yes | Yes | Yes | Yes | 
| Coupon Total | No | No | No | Yes | No | 
| Customers Total | No | No | No | Yes | No | 
| Order | Yes | Yes | Yes | Yes | Yes | 
| Orders Total | No | No | No | Yes | No | 
| Payment Gateway | No | No | No | Yes | No | 
| Product | Yes | Yes | Yes | Yes | Yes | 
| Product attribute | Yes | Yes | Yes | Yes | Yes | 
| Product category | Yes | Yes | Yes | Yes | Yes | 
| Product review | Yes | Yes | Yes | Yes | Yes | 
| Product shipping class | Yes | Yes | Yes | Yes | Yes | 
| Product tag | Yes | Yes | Yes | Yes | Yes | 
| Product variation | Yes | Yes | Yes | Yes | Yes | 
| Products Total | No | No | No | Yes | No | 
| Report (List) | No | No | No | Yes | No | 
| Reviews Total | No | No | No | Yes | No | 
| Sales Report | Yes | No | No | Yes | No | 
| Shipping Method | No | No | No | Yes | No | 
| Shipping Zone | No | No | No | Yes | No | 
| Shipping Zone Location | No | No | No | Yes | No | 
| Shipping Zone Method | No | No | No | Yes | No | 
| Tax Rate | Yes | Yes | Yes | Yes | Yes | 
| Tax Class | No | No | No | Yes | No | 
| Top Sellers Report | Yes | No | No | Yes | No | 

**Example**:

```
# Read the coupon entity through an existing WooCommerce connection.
woocommerce_read = glueContext.create_dynamic_frame.from_options(
    connection_type="glue.spark.woocommerce",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "coupon",
        "API_VERSION": "v3",
        "INSTANCE_URL": "instanceUrl"
    }
)
```

**WooCommerce entity and field details**:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/woocommerce-reading-from-entities.html)

**Note**  
Struct and List data types are converted to String data type, and DateTime data type is converted to Timestamp in the response of the connectors.

## Partitioning queries
<a name="woocommerce-reading-partitioning-queries"></a>

**Record-based partitioning**:

You can provide the additional Spark option `NUM_PARTITIONS` if you want to utilize concurrency in Spark. With this option, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can execute concurrently.

In record-based partitioning, the total number of records is queried from the WooCommerce API and divided by the `NUM_PARTITIONS` value that you provide. Each sub-query then fetches its share of the records concurrently.
+ `NUM_PARTITIONS`: the number of partitions.
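
The division described above can be illustrated with a short calculation. This sketch only shows the arithmetic of spreading a record count across partitions. The connector's actual splitting logic is not documented here, so the function `split_records` and its handling of the remainder are assumptions.

```python
def split_records(total_records, num_partitions):
    """Return a list of (offset, count) pairs that together cover all records.

    Illustrative only: remainder records are given to the first partitions.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    base, extra = divmod(total_records, num_partitions)
    ranges = []
    offset = 0
    for index in range(num_partitions):
        # The first `extra` partitions take one additional record.
        count = base + (1 if index < extra else 0)
        ranges.append((offset, count))
        offset += count
    return ranges
```

For example, 105 records with `NUM_PARTITIONS` set to 10 would give five sub-queries of 11 records followed by five of 10 under this scheme.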

The following entities support record-based partitioning:
+ coupon
+ order
+ product
+ product-attribute
+ product-attribute-term
+ product-category
+ product-review
+ product-shipping-class
+ product-tag
+ product-variation
+ tax-rate
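
A job script can guard against setting `NUM_PARTITIONS` for an entity that doesn't support it. The helper below is a hypothetical sketch: the entity names are copied from the list above, and the decision to silently omit the option for other entities is an assumption, not documented connector behavior.

```python
# Entity names copied from the list of entities that support
# record-based partitioning.
PARTITIONABLE_ENTITIES = frozenset({
    "coupon",
    "order",
    "product",
    "product-attribute",
    "product-attribute-term",
    "product-category",
    "product-review",
    "product-shipping-class",
    "product-tag",
    "product-variation",
    "tax-rate",
})


def partition_options(entity_name, num_partitions):
    """Return the NUM_PARTITIONS option for supported entities, else an empty dict."""
    if entity_name in PARTITIONABLE_ENTITIES and num_partitions > 1:
        # The examples in this guide pass the value as a string.
        return {"NUM_PARTITIONS": str(num_partitions)}
    return {}
```

The returned dictionary can be merged into `connection_options` with `dict.update` before calling `create_dynamic_frame.from_options`.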

Example:

```
# Read the coupon entity in 10 concurrent partitions.
woocommerce_read = glueContext.create_dynamic_frame.from_options(
    connection_type="glue.spark.woocommerce",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "coupon",
        "API_VERSION": "v3",
        "INSTANCE_URL": "instanceUrl",
        "NUM_PARTITIONS": "10"
    }
)
```

The same options apply to any other entity that supports record-based partitioning. For example, to read the `order` entity in 10 partitions:

```
woocommerce_read = glueContext.create_dynamic_frame.from_options(
    connection_type="glue.spark.woocommerce",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "order",
        "API_VERSION": "v3",
        "INSTANCE_URL": "instanceUrl",
        "NUM_PARTITIONS": "10"
    }
)
```

# WooCommerce connection options
<a name="woocommerce-connection-options"></a>

The following are connection options for WooCommerce:
+ `ENTITY_NAME` (String) - (Required) Used for Read. The name of your object in WooCommerce.
+ `API_VERSION` (String) - (Required) Used for Read. The WooCommerce REST API version you want to use.
+ `REALM_ID` (String) - An ID that identifies an individual WooCommerce Online company where you send requests.
+ `SELECTED_FIELDS` (List<String>) - Default: empty (SELECT \*). Used for Read. Columns you want to select for the object.
+ `FILTER_PREDICATE` (String) - Default: empty. Used for Read. It should be in the Spark SQL format.
+ `QUERY` (String) - Default: empty. Used for Read. Full Spark SQL query.
+ `INSTANCE_URL` (String) - (Required) A valid WooCommerce instance URL with the format: `https://<instance>.wpcomstaging.com`
+ `NUM_PARTITIONS` (Integer) - Default: 1. Used for Read. Number of partitions for read.
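
The required options listed above can be checked before a job runs. The following is a minimal sketch, assuming that a plain dictionary is later passed as `connection_options`. The function `validate_connection_options` is hypothetical, and the HTTPS check is an assumption derived from the documented URL format.

```python
REQUIRED_OPTIONS = ("ENTITY_NAME", "API_VERSION", "INSTANCE_URL")
SUPPORTED_API_VERSIONS = ("v3",)


def validate_connection_options(options):
    """Return a list of human-readable problems. An empty list means no problems were found."""
    problems = []
    for key in REQUIRED_OPTIONS:
        if not options.get(key):
            problems.append(f"missing required option {key}")

    api_version = options.get("API_VERSION")
    if api_version and api_version not in SUPPORTED_API_VERSIONS:
        problems.append(f"unsupported API_VERSION {api_version}")

    instance_url = options.get("INSTANCE_URL")
    if instance_url and not instance_url.startswith("https://"):
        problems.append("INSTANCE_URL should start with https://")

    num_partitions = options.get("NUM_PARTITIONS")
    if num_partitions is not None:
        try:
            valid = int(num_partitions) >= 1
        except (TypeError, ValueError):
            valid = False
        if not valid:
            problems.append("NUM_PARTITIONS must be a positive integer")
    return problems
```

Returning a list instead of raising on the first error lets a job log every misconfiguration in a single run.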