

# CreateBulkImportJob API
<a name="ingest-bulkImport"></a>

Use the `CreateBulkImportJob` API to import large amounts of data from Amazon S3. Your data must be saved in CSV format in Amazon S3. Data files can have the following columns.

**Note**  
 Data older than 1 January 1970 00:00:00 UTC is not supported.   
To identify an asset property, specify one of the following.  
The `ASSET_ID` and `PROPERTY_ID` of the asset property that you're sending data to.
The `ALIAS`, which is a data stream alias (for example, `/company/windfarm/3/turbine/7/temperature`). To use this option, you must first set your asset property's alias. To learn how to set property aliases, see [Manage data streams for AWS IoT SiteWise](manage-data-streams.md).
+ `ALIAS` – The alias that identifies the property, such as an OPC UA server data stream path (for example, `/company/windfarm/3/turbine/7/temperature`). For more information, see [Manage data streams for AWS IoT SiteWise](manage-data-streams.md).
+ `ASSET_ID` – The ID of the asset.
+ `PROPERTY_ID` – The ID of the asset property.
+ `DATA_TYPE` – The property's data type can be one of the following.
  + `STRING` – A string with up to 1024 bytes.
  + `INTEGER` – A signed 32-bit integer with range [-2,147,483,648, 2,147,483,647].
  + `DOUBLE` – A floating point number with range [-10^100, 10^100] and IEEE 754 double precision.
  + `BOOLEAN` – `true` or `false`.
+ `TIMESTAMP_SECONDS` – The timestamp of the data point, in Unix epoch time.
+ `TIMESTAMP_NANO_OFFSET` – The nanosecond offset of the data point's timestamp, added to `TIMESTAMP_SECONDS`.
+ `QUALITY` – (Optional) The quality of the asset property value. The value can be one of the following.
  + `GOOD` – (Default) The data isn't affected by any issues.
  + `BAD` – The data is affected by an issue such as sensor failure.
  + `UNCERTAIN` – The data is affected by an issue such as sensor inaccuracy.

  For more information about how AWS IoT SiteWise handles data quality in computations, see [Data quality in formula expressions](expression-tutorials.md#data-quality).
+ `VALUE` – The value of the asset property.

**Example data files in .csv format**  

```
asset_id,property_id,DOUBLE,1635201373,0,GOOD,1.0
asset_id,property_id,DOUBLE,1635201374,0,GOOD,2.0
asset_id,property_id,DOUBLE,1635201375,0,GOOD,3.0
```

```
unmodeled_alias1,DOUBLE,1635201373,0,GOOD,1.0
unmodeled_alias1,DOUBLE,1635201374,0,GOOD,2.0
unmodeled_alias1,DOUBLE,1635201375,0,GOOD,3.0
unmodeled_alias1,DOUBLE,1635201376,0,GOOD,4.0
unmodeled_alias1,DOUBLE,1635201377,0,GOOD,5.0
unmodeled_alias1,DOUBLE,1635201378,0,GOOD,6.0
unmodeled_alias1,DOUBLE,1635201379,0,GOOD,7.0
unmodeled_alias1,DOUBLE,1635201380,0,GOOD,8.0
unmodeled_alias1,DOUBLE,1635201381,0,GOOD,9.0
unmodeled_alias1,DOUBLE,1635201382,0,GOOD,10.0
```
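Data files like the ones above can also be generated programmatically. The following Python sketch writes rows in the alias-based column order `ALIAS,DATA_TYPE,TIMESTAMP_SECONDS,TIMESTAMP_NANO_OFFSET,QUALITY,VALUE`; the alias, timestamps, and values are placeholders.

```python
import csv
import io

def write_bulk_import_rows(alias, start_epoch, values, quality="GOOD"):
    """Write rows in the ALIAS, DATA_TYPE, TIMESTAMP_SECONDS,
    TIMESTAMP_NANO_OFFSET, QUALITY, VALUE column order."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for i, value in enumerate(values):
        # One DOUBLE data point per second, with no nanosecond offset.
        writer.writerow([alias, "DOUBLE", start_epoch + i, 0, quality, value])
    return buf.getvalue()

print(write_bulk_import_rows("unmodeled_alias1", 1635201373, [1.0, 2.0, 3.0]))
```

Save the output to a `.csv` object in your Amazon S3 data bucket before creating the job.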

AWS IoT SiteWise provides the following API operations to create a bulk import job and get information about an existing job.
+ [CreateBulkImportJob](https://docs.aws.amazon.com/iot-sitewise/latest/APIReference/API_CreateBulkImportJob.html) – Creates a new bulk import job.
+ [DescribeBulkImportJob](https://docs.aws.amazon.com/iot-sitewise/latest/APIReference/API_DescribeBulkImportJob.html) – Retrieves information about a bulk import job.
+ [ListBulkImportJobs](https://docs.aws.amazon.com/iot-sitewise/latest/APIReference/API_ListBulkImportJobs.html) – Retrieves a paginated list of summaries of all bulk import jobs.

# Create an AWS IoT SiteWise bulk import job (AWS CLI)
<a name="CreateBulkImportJob"></a>

Use the [CreateBulkImportJob](https://docs.aws.amazon.com/iot-sitewise/latest/APIReference/API_CreateBulkImportJob.html) API operation to transfer data from Amazon S3 to AWS IoT SiteWise. The API enables ingestion of large volumes of historical data and buffered ingestion of analytical data streams in small batches, providing a cost-effective way to ingest data. The following example uses the AWS CLI.

**Important**  
Before creating a bulk import job, you must enable AWS IoT SiteWise warm tier or AWS IoT SiteWise cold tier. For more information, see [Configure storage settings in AWS IoT SiteWise](configure-storage.md).  
The [CreateBulkImportJob](https://docs.aws.amazon.com/iot-sitewise/latest/APIReference/API_CreateBulkImportJob.html) API supports ingestion of historical data into AWS IoT SiteWise with the option to set the `adaptiveIngestion` parameter.  
When set to `false`, the API ingests historical data without triggering computations or notifications.  
When set to `true`, the API ingests new data, calculating metrics and transforming the data to optimize ongoing analytics and notifications within seven days.

Run the following command. Replace *file-name* with the name of the file that contains the bulk import job configuration.

```
aws iotsitewise create-bulk-import-job --cli-input-json file://file-name.json
```

**Example Bulk import job configuration**  
The following are examples of configuration settings:  
+ Replace *adaptive-ingestion-flag* with `true` or `false`.
  + If set to `false`, the bulk import job ingests historical data into AWS IoT SiteWise.
  + If set to `true`, the bulk import job does the following:
    + Ingests new data into AWS IoT SiteWise.
    + Calculates metrics and transforms, and supports notifications for data with a time stamp that's within seven days.
+ Replace *delete-files-after-import-flag* with `true` to delete the data from the Amazon S3 data bucket after ingesting into AWS IoT SiteWise warm tier storage.
+ Replace *amzn-s3-demo-bucket-for-errors* with the name of the Amazon S3 bucket to which errors associated with this bulk import job are sent.
+ Replace *amzn-s3-demo-bucket-for-errors-prefix* with the prefix of the Amazon S3 bucket to which errors associated with this bulk import job are sent.

  Amazon S3 uses the prefix as a folder name to organize data in the bucket. Each object in a bucket has exactly one key, which is its unique identifier. The prefix must end with a forward slash (/). For more information, see [Organizing objects using prefixes](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-prefixes.html) in the *Amazon Simple Storage Service User Guide*.
+ Replace *amzn-s3-demo-bucket-data* with the name of the Amazon S3 bucket from which data is imported.
+ Replace *data-bucket-key* with the key of the Amazon S3 object that contains your data. Each object in a bucket has exactly one key, which is its unique identifier.
+ Replace *data-bucket-version-id* with the version ID to identify a specific version of the Amazon S3 object that contains your data. This parameter is optional.
+ Replace *column-name* with the column name specified in the .csv file.
+ Replace *job-name* with a unique name that identifies the bulk import job.
+ Replace *job-role-arn* with the ARN of the IAM role that allows AWS IoT SiteWise to read Amazon S3 data.

  Make sure that your role has the permissions shown in the following example. Replace *amzn-s3-demo-bucket-data* with the name of the Amazon S3 bucket that contains your data. Also, replace *amzn-s3-demo-bucket-for-errors* with the name of the Amazon S3 bucket to which errors associated with this bulk import job are sent.

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:GetObject",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-bucket-data",
                "arn:aws:s3:::amzn-s3-demo-bucket-data/*"
            ],
            "Effect": "Allow"
        },
        {
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:GetBucketLocation"
            ],
            "Resource": [
                "arn:aws:s3:::amzn-s3-demo-bucket-for-errors",
                "arn:aws:s3:::amzn-s3-demo-bucket-for-errors/*"
            ],
            "Effect": "Allow"
        }
    ]
}
```
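The permissions policy above must be attached to a role that AWS IoT SiteWise can assume. As a sketch, the following Python snippet builds a trust policy for that role; the `iotsitewise.amazonaws.com` service principal is an assumption to verify against the AWS IoT SiteWise documentation for your Region.

```python
import json

# Trust policy allowing AWS IoT SiteWise to assume the job role.
# The service principal name below is an assumption -- confirm it in
# the AWS IoT SiteWise documentation before creating the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "iotsitewise.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(trust_policy, indent=4))
```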

```
{
   "adaptiveIngestion": adaptive-ingestion-flag,
   "deleteFilesAfterImport": delete-files-after-import-flag,
   "errorReportLocation": { 
      "bucket": "amzn-s3-demo-bucket-for-errors",
      "prefix": "amzn-s3-demo-bucket-for-errors-prefix"
   },
   "files": [ 
      { 
         "bucket": "amzn-s3-demo-bucket-data",
         "key": "data-bucket-key",
         "versionId": "data-bucket-version-id"
      }
   ],
   "jobConfiguration": { 
      "fileFormat": { 
         "csv": { 
            "columnNames": [ "column-name" ]
         }
      }
   },
   "jobName": "job-name",
   "jobRoleArn": "job-role-arn"    
}
```
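If you prefer to generate the configuration file programmatically, this Python sketch writes a `file-name.json` that matches the structure above; every bucket name, key, job name, and ARN is a placeholder value.

```python
import json

# All values below are placeholders -- substitute your own buckets,
# object key, job name, and role ARN before creating the job.
job_config = {
    "adaptiveIngestion": False,
    "deleteFilesAfterImport": False,
    "errorReportLocation": {
        "bucket": "amzn-s3-demo-bucket-for-errors",
        "prefix": "errors/",
    },
    "files": [
        {"bucket": "amzn-s3-demo-bucket-data", "key": "data-bucket-key"}
    ],
    "jobConfiguration": {
        "fileFormat": {
            "csv": {
                "columnNames": [
                    "ALIAS", "DATA_TYPE", "TIMESTAMP_SECONDS",
                    "TIMESTAMP_NANO_OFFSET", "QUALITY", "VALUE",
                ]
            }
        }
    },
    "jobName": "myBulkImportJob",
    "jobRoleArn": "arn:aws:iam::123456789012:role/DemoRole",
}

# Write the file consumed by --cli-input-json file://file-name.json
with open("file-name.json", "w") as f:
    json.dump(job_config, f, indent=3)
```

The optional `versionId` entry under `files` is omitted here; add it if you need to import a specific object version.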

**Example response**  

```
{
   "jobId":"f8c031d0-01d1-4b94-90b1-afe8bb93b7e5",
   "jobStatus":"PENDING",
   "jobName":"myBulkImportJob"
}
```

# Describe an AWS IoT SiteWise bulk import job (AWS CLI)
<a name="DescribeBulkImportJob"></a>

Use the [DescribeBulkImportJob](https://docs.aws.amazon.com/iot-sitewise/latest/APIReference/API_DescribeBulkImportJob.html) API operation to retrieve information about a specific bulk import job in AWS IoT SiteWise. This operation returns details such as the job's status, creation time, and error information if the job failed. You can use this operation to monitor job progress and troubleshoot issues. To use `DescribeBulkImportJob`, you need the job ID from the `CreateBulkImportJob` operation. The API returns the following information:
+ List of files being imported, including their Amazon S3 bucket locations and keys
+ Error report location (if applicable)
+ Job configuration details, such as file format and CSV column names
+ Job creation and last update timestamps
+ Current job status (for example, whether the job is in progress, completed, or failed)
+ IAM role ARN used for the import job

For completed jobs, review the results to confirm successful data integration. If a job fails, examine the error details to diagnose and resolve issues.
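Because a job can take a while to finish, you may want to poll `DescribeBulkImportJob` until it reaches a terminal status. The following Python sketch shows one way to structure such a loop; it accepts any zero-argument callable that returns a response dict (for example, a wrapper around an SDK describe call), and the set of terminal status values is an assumption to check against the API reference.

```python
import time

# Assumed terminal statuses -- verify against the API reference.
TERMINAL_STATUSES = {"COMPLETED", "COMPLETED_WITH_FAILURES", "FAILED", "CANCELLED"}

def wait_for_job(describe, poll_seconds=30, max_polls=120):
    """Poll `describe` until the job reaches a terminal status.

    `describe` is any zero-argument callable returning a response dict
    that contains "jobStatus" (for example, a lambda wrapping an SDK
    describe-bulk-import-job call).
    """
    for _ in range(max_polls):
        status = describe()["jobStatus"]
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("bulk import job did not finish in time")
```

Pair this with a call to `DescribeBulkImportJob` using the job ID returned by `CreateBulkImportJob`.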

Replace *job-ID* with the ID of the bulk import job that you want to retrieve.

```
aws iotsitewise describe-bulk-import-job --job-id job-ID
```

**Example response**  

```
{
   "files":[
      {
         "bucket":"amzn-s3-demo-bucket1",
         "key":"100Tags12Hours.csv"
      },
      {
         "bucket":"amzn-s3-demo-bucket2",
         "key":"BulkImportData1MB.csv"
      },
      {
         "bucket":"amzn-s3-demo-bucket3",
         "key":"UnmodeledBulkImportData1MB.csv"
      }
   ],
   "errorReportLocation":{
      "prefix":"errors/",
      "bucket":"amzn-s3-demo-bucket-for-errors"
   },
   "jobConfiguration":{
      "fileFormat":{
         "csv":{
            "columnNames":[
               "ALIAS",
               "DATA_TYPE",
               "TIMESTAMP_SECONDS",
               "TIMESTAMP_NANO_OFFSET",
               "QUALITY",
               "VALUE"
            ]
         }
      }
   },
   "jobCreationDate":1645745176.498,
   "jobStatus":"COMPLETED",
   "jobName":"myBulkImportJob",
   "jobLastUpdateDate":1645745279.968,
   "jobRoleArn":"arn:aws:iam::123456789012:role/DemoRole",
   "jobId":"f8c031d0-01d1-4b94-90b1-afe8bb93b7e5"
}
```
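`jobCreationDate` and `jobLastUpdateDate` are Unix epoch timestamps in seconds. As a small illustration, this Python sketch parses a trimmed copy of the response above and computes how long the job ran.

```python
import json
from datetime import datetime, timezone

# A trimmed copy of the DescribeBulkImportJob response fields above.
response = json.loads("""
{
   "jobCreationDate": 1645745176.498,
   "jobLastUpdateDate": 1645745279.968,
   "jobStatus": "COMPLETED",
   "jobName": "myBulkImportJob"
}
""")

created = datetime.fromtimestamp(response["jobCreationDate"], tz=timezone.utc)
elapsed = response["jobLastUpdateDate"] - response["jobCreationDate"]
print(f"{response['jobName']}: {response['jobStatus']}, "
      f"created {created:%Y-%m-%d %H:%M:%S} UTC, ran for {elapsed:.1f} s")
```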

# List AWS IoT SiteWise bulk import jobs (AWS CLI)
<a name="ListBulkImportJobs"></a>

Use the [ListBulkImportJobs](https://docs.aws.amazon.com/iot-sitewise/latest/APIReference/API_ListBulkImportJobs.html) API operation to retrieve a list of summaries for bulk import jobs in AWS IoT SiteWise. This operation provides an efficient way to monitor and manage your data import processes. It returns the following key information for each job:
+ Job ID – A unique identifier for each bulk import job
+ Job name – The name you assigned to the job when creating it
+ Current status – The job's current state (for example, `COMPLETED`, `RUNNING`, `FAILED`)

`ListBulkImportJobs` is particularly useful for getting a comprehensive overview of all your bulk import jobs. This can help you track multiple data imports, identify any jobs that require attention, and maintain an organized workflow.

The operation supports pagination, allowing you to retrieve large numbers of job summaries efficiently. You can use the job IDs returned by this operation with the [DescribeBulkImportJob](https://docs.aws.amazon.com/iot-sitewise/latest/APIReference/API_DescribeBulkImportJob.html) operation to retrieve more detailed information about specific jobs. This two-step process allows you to first get a high-level view of all jobs, and then drill down into the details of jobs of interest.

When using `ListBulkImportJobs`, you can apply filters to narrow down the results. For example, you can filter jobs based on their status to retrieve only completed jobs or only running jobs. This helps you focus on the most relevant information for your current task. The operation also returns a `nextToken` if more results are available. You can use this token in subsequent calls to retrieve the next set of job summaries, enabling you to iterate through all your bulk import jobs even if you have a large number of them.

The following example demonstrates how to use `ListBulkImportJobs` with the AWS CLI to retrieve a list of completed jobs.

```
aws iotsitewise list-bulk-import-jobs --filter COMPLETED
```

**Example Response for completed jobs filter**  

```
{
   "jobSummaries":[
      {
         "id":"bdbbfa52-d775-4952-b816-13ba1c7cb9da",
         "name":"myBulkImportJob",
         "status":"COMPLETED"
      },
      {
         "id":"15ffc641-dbd8-40c6-9983-5cb3b0bc3e6b",
         "name":"myBulkImportJob2",
         "status":"COMPLETED"
      }
   ]
}
```

This command demonstrates how to use `ListBulkImportJobs` to retrieve a list of jobs that completed with failures. The maximum is set to 50 results, and a next token is used to retrieve paginated results.

```
aws iotsitewise list-bulk-import-jobs --filter COMPLETED_WITH_FAILURES --max-results 50 --next-token "string"
```
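A typical way to consume `nextToken` is a loop that keeps requesting pages until no token is returned. The following Python sketch shows the pattern with a caller-supplied page function (for example, a wrapper around an SDK list call); it is illustrative and not tied to a specific SDK.

```python
def list_all_jobs(list_page):
    """Collect all job summaries by following nextToken.

    `list_page` is any callable that takes the current token (None on
    the first call) and returns a response dict with "jobSummaries"
    and, if more pages remain, "nextToken".
    """
    summaries, token = [], None
    while True:
        page = list_page(token)
        summaries.extend(page.get("jobSummaries", []))
        token = page.get("nextToken")
        if not token:
            return summaries
```

Passing a wrapper that forwards `token` as the `--next-token`/`nextToken` request parameter yields every job summary across all pages.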