

# Reading from HubSpot entities
<a name="hubspot-reading-from-entities"></a>

**Prerequisite**

You must have a HubSpot object that you want to read from. You need the object name, such as `contact` or `task`. The following table lists the supported entities for the Sync source.

## Supported entities for Sync source
<a name="sync-table"></a>


| Entity | API version | Can be filtered | Supports limit | Supports Order by | Supports Select \* | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | --- | 
| Campaigns | v1 | No | Yes | No | Yes | No | 
| Companies | v3 | Yes | Yes | Yes | Yes | Yes | 
| Contacts | v3 | Yes | Yes | Yes | Yes | Yes | 
| Contact Lists | v1 | No | Yes | No | Yes | No | 
| Deals | v3 | Yes | Yes | Yes | Yes | Yes | 
| CRM Pipeline (Deal Pipelines) | v1 | No | No | No | Yes | No | 
| Email Events | v1 | No | Yes | No | Yes | No | 
| Calls | v3 | Yes | Yes | Yes | Yes | Yes | 
| Notes | v3 | Yes | Yes | Yes | Yes | Yes | 
| Emails | v3 | Yes | Yes | Yes | Yes | Yes | 
| Meetings | v3 | Yes | Yes | Yes | Yes | Yes | 
| Tasks | v3 | Yes | Yes | Yes | Yes | Yes | 
| Postal Mails | v3 | Yes | Yes | Yes | Yes | Yes | 
| Custom Objects | v3 | Yes | Yes | Yes | Yes | Yes | 
| Forms | v2 | No | No | No | Yes | No | 
| Owners | v3 | No | Yes | No | Yes | No | 
| Products | v3 | Yes | Yes | Yes | Yes | Yes | 
| Tickets | v3 | Yes | Yes | Yes | Yes | Yes | 
| Workflows | v3 | No | No | No | Yes | No | 
| Associations | v4 | Yes | No | No | Yes | No | 
| Associations Labels | v4 | No | No | No | Yes | No | 

**Example**:

```
# Read the contact entity through the HubSpot v3 API (Sync mode).
hubspot_read = glueContext.create_dynamic_frame.from_options(
    connection_type="hubspot",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "contact",
        "API_VERSION": "v3"
    }
)
```

## Supported entities for Async source
<a name="async-table"></a>


| Entity | API version | Can be filtered | Supports limit | Supports Order by | Supports Select \* | Supports partitioning | 
| --- | --- | --- | --- | --- | --- | --- | 
| Companies | v3 | Yes | No | Yes | Yes | No | 
| Contacts | v3 | Yes | No | Yes | Yes | No | 
| Deals | v3 | Yes | No | Yes | Yes | No | 
| Calls | v3 | Yes | No | Yes | Yes | No | 
| Notes | v3 | Yes | No | Yes | Yes | No | 
| Emails | v3 | Yes | No | Yes | Yes | No | 
| Meetings | v3 | Yes | No | Yes | Yes | No | 
| Tasks | v3 | Yes | No | Yes | Yes | No | 
| Postal Mails | v3 | Yes | No | Yes | Yes | No | 
| Custom Objects | v3 | Yes | No | Yes | Yes | No | 
| Products | v3 | Yes | No | Yes | Yes | No | 
| Tickets | v3 | Yes | No | Yes | Yes | No | 

**Example**:

```
# Read the contact entity through the HubSpot v3 API in Async mode.
hubspot_read = glueContext.create_dynamic_frame.from_options(
    connection_type="hubspot",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "contact",
        "API_VERSION": "v3",
        "TRANSFER_MODE": "ASYNC"
    }
)
```
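
According to the Async table above, no Async entity supports partitioning. The following is a minimal sketch of a pre-flight check that flags partitioning options in an Async configuration before a job is submitted. The helper name `unsupported_async_options` is hypothetical and not part of AWS Glue, and the sketch assumes that options without `TRANSFER_MODE` run in Sync mode, as in the Sync example.

```python
# Partitioning options named in the "Partitioning queries" section.
PARTITION_OPTIONS = ("PARTITION_FIELD", "LOWER_BOUND", "UPPER_BOUND", "NUM_PARTITIONS")


def unsupported_async_options(connection_options):
    """Return the partitioning options present in an Async configuration.

    Hypothetical helper: it only inspects the dictionary and does not call AWS Glue.
    """
    # Assumption: anything other than TRANSFER_MODE=ASYNC is treated as Sync mode.
    if connection_options.get("TRANSFER_MODE") != "ASYNC":
        return []
    return [name for name in PARTITION_OPTIONS if name in connection_options]


options = {
    "connectionName": "connectionName",
    "ENTITY_NAME": "contact",
    "API_VERSION": "v3",
    "TRANSFER_MODE": "ASYNC",
    "PARTITION_FIELD": "hs_object_id",
    "NUM_PARTITIONS": "10",
}
conflicts = unsupported_async_options(options)
if conflicts:
    print("Remove these options for Async mode:", ", ".join(conflicts))
```

An empty result means the configuration does not combine Async mode with partitioning options.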

**HubSpot entity and field details**:

**HubSpot API v4**: 

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/hubspot-reading-from-entities.html)

**Note**  
For the `Associations` object, to fetch associations between two objects, you must provide the 'from ID' (the ID of the first object) through a mandatory filter when you create the AWS Glue job. To fetch associations for multiple from IDs, provide all of the IDs in the `where` clause. For example, to fetch `Associations` for contact IDs '1' and '151', provide the filter `where id=1 AND id=151`.
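
When the list of from IDs is long, the filter text can be generated rather than typed by hand. The following is a minimal sketch that builds the condition in the same form as the example in the note. The helper name `build_association_filter` is hypothetical, and the sketch only produces the condition string; how the filter is supplied to the job is not shown here.

```python
def build_association_filter(from_ids):
    """Build a filter condition such as 'id=1 AND id=151' from a list of from IDs.

    Hypothetical helper: it mirrors the filter format shown in the note above.
    """
    if not from_ids:
        raise ValueError("At least one from ID is required for the Associations object")
    # int() rejects values that are not numeric IDs.
    return " AND ".join(f"id={int(from_id)}" for from_id in from_ids)


association_filter = build_association_filter([1, 151])
print(f"where {association_filter}")
```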

**HubSpot API v3**:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/hubspot-reading-from-entities.html)

For the following entities, HubSpot provides endpoints to fetch metadata dynamically, so that operator support is captured at the datatype level for each entity.

**Note**  
`DML_STATUS` is a virtual field added on every record at runtime to determine its status (CREATED/UPDATED) in the Sync mode. The `CONTAINS/LIKE` operator is not supported in the Async mode.
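
As an illustration of how `DML_STATUS` might be used downstream, the following sketch groups records by that field. It works on plain Python dictionaries with made-up sample data rather than on an AWS Glue `DynamicFrame`, and the helper name `group_by_dml_status` is hypothetical.

```python
from collections import defaultdict


def group_by_dml_status(records):
    """Group records by their DML_STATUS value (CREATED or UPDATED in Sync mode)."""
    grouped = defaultdict(list)
    for record in records:
        # Records without the field are collected under "UNKNOWN" (an assumption of this sketch).
        grouped[record.get("DML_STATUS", "UNKNOWN")].append(record)
    return dict(grouped)


# Made-up sample records for illustration only.
sample_records = [
    {"id": "1", "DML_STATUS": "CREATED"},
    {"id": "2", "DML_STATUS": "UPDATED"},
    {"id": "3", "DML_STATUS": "CREATED"},
]
grouped = group_by_dml_status(sample_records)
for status, items in grouped.items():
    print(status, [item["id"] for item in items])
```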

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/hubspot-reading-from-entities.html)

**HubSpot API v2**:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/hubspot-reading-from-entities.html)

**HubSpot API v1**:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/hubspot-reading-from-entities.html)

## Partitioning queries
<a name="hubspot-reading-partitioning-queries"></a>

To take advantage of concurrency in Spark, you can provide the additional Spark options `PARTITION_FIELD`, `LOWER_BOUND`, `UPPER_BOUND`, and `NUM_PARTITIONS`. With these parameters, the original query is split into `NUM_PARTITIONS` sub-queries that Spark tasks can run concurrently.
+ `PARTITION_FIELD`: the name of the field to be used to partition the query.
+ `LOWER_BOUND`: an **inclusive** lower bound value of the chosen partition field.

  For a DateTime field, the value must be in ISO format.

  Example of a valid value:

  ```
  "2024-01-01T10:00:00.115Z"
  ```
+ `UPPER_BOUND`: an **exclusive** upper bound value of the chosen partition field.
+ `NUM_PARTITIONS`: the number of partitions.

The following table describes the entity partitioning field support details:

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/hubspot-reading-from-entities.html)

**Example**:

```
# Read the company entity in 10 partitions split on hs_object_id.
hubspot_read = glueContext.create_dynamic_frame.from_options(
    connection_type="hubspot",
    connection_options={
        "connectionName": "connectionName",
        "ENTITY_NAME": "company",
        "API_VERSION": "v3",
        "PARTITION_FIELD": "hs_object_id",
        "LOWER_BOUND": "50",
        "UPPER_BOUND": "16726619290",
        "NUM_PARTITIONS": "10"
    }
)
```
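
To make the role of the bounds concrete, the following sketch splits an inclusive lower bound and an exclusive upper bound into `NUM_PARTITIONS` contiguous ranges of roughly equal width. This is only an illustration under the assumption of an even split on a numeric field; the actual splitting logic that the connector uses is not described in this section, and the function name `split_bounds` is hypothetical.

```python
def split_bounds(lower_bound, upper_bound, num_partitions):
    """Split [lower_bound, upper_bound) into contiguous, non-overlapping ranges.

    Hypothetical illustration of an even split; not the connector's implementation.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if upper_bound <= lower_bound:
        raise ValueError("upper_bound must be greater than lower_bound")
    span = upper_bound - lower_bound
    ranges = []
    for index in range(num_partitions):
        # Integer arithmetic keeps the ranges contiguous and ends exactly at upper_bound.
        start = lower_bound + span * index // num_partitions
        end = lower_bound + span * (index + 1) // num_partitions
        ranges.append((start, end))
    return ranges


ranges = split_bounds(50, 16726619290, 10)
for start, end in ranges:
    # Each range is inclusive at the start and exclusive at the end.
    print(f"hs_object_id >= {start} AND hs_object_id < {end}")
```

Each printed condition corresponds to one sub-query that a Spark task could run independently of the others.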