

Amazon Fraud Detector is no longer open to new customers as of November 7, 2025. For capabilities similar to Amazon Fraud Detector, explore Amazon SageMaker, AutoGluon, and AWS WAF.

# Event data storage
<a name="event-data-storage"></a>

After you've gathered your dataset, you store your dataset internally using Amazon Fraud Detector or externally with Amazon Simple Storage Service (Amazon S3). We recommend that you choose where to store your dataset based on the model you use for generating fraud predictions. The following is a detailed breakdown of these two storage options.
+ **Internal storage** - Your dataset is stored with Amazon Fraud Detector. All events of the same event type are stored together. You can upload event data to Amazon Fraud Detector at any time: either stream events one at a time to an Amazon Fraud Detector API, or import large datasets (up to 1GB) using the batch import feature. When you train a model using the dataset stored with Amazon Fraud Detector, you can specify a time range to limit the size of your dataset. 
+ **External storage** - Your dataset is stored in an external data source outside Amazon Fraud Detector. Currently, Amazon Fraud Detector supports using Amazon Simple Storage Service (Amazon S3) for this purpose. If your model is trained on a file that's uploaded to Amazon S3, that file can't contain more than 5GB of uncompressed data. If it's more than that, shorten the time range of your dataset.

The following table provides details about the model type and the data source it supports.


| Model type | Compatible training data source | 
| --- | --- | 
|  Online Fraud Insights  |  External storage, Internal storage  | 
|  Transaction Fraud Insights  |  Internal storage  | 
|  Account Takeover Insights  |  Internal storage  | 

For information on storing your dataset externally with Amazon Simple Storage Service, see [Store your event data externally with Amazon S3](uploading-event-data-to-an-s3-bucket.md). For information on storing your dataset internally with Amazon Fraud Detector, see [Store your event data internally with Amazon Fraud Detector](storing-event-data-afd.md).

# Store your event data externally with Amazon S3
<a name="uploading-event-data-to-an-s3-bucket"></a>

If you are training an Online Fraud Insights model, you can choose to store your event data externally with Amazon S3. To store your event data in Amazon S3, you must first create a text file in CSV format, add your event data, and then upload the CSV file to an Amazon S3 bucket. 

**Note**  
The **Transaction Fraud Insights** and **Account Takeover Insights** model types do not support datasets stored externally with Amazon S3.

# Create CSV file
<a name="creating-csv-file"></a>

Amazon Fraud Detector requires that the first row of your CSV file contain column headers. The column headers in your CSV file must map to the variables that are defined in the event type. For an example dataset, see [Get and upload example dataset](step-1-get-s3-data.md).

The Online Fraud Insights model requires a training dataset that has at least 2 variables and up to 100 variables. In addition to the event variables, the training dataset must contain the following headers:
+ EVENT\_TIMESTAMP - Defines when the event occurred.
+ EVENT\_LABEL - Classifies the event as fraudulent or legitimate. The values in the column must correspond to the values defined in the event type.

The following sample CSV data represents historical registration events from an online merchant: 

```
EVENT_TIMESTAMP,EVENT_LABEL,ip_address,email_address
4/10/2019 11:05,fraud,209.146.137.48,fake_burtonlinda@example.net
12/20/2018 20:04,legit,203.0.112.189,fake_davidbutler@example.org
3/14/2019 10:56,legit,169.255.33.54,fake_shelby76@example.net
1/3/2019 8:38,legit,192.119.44.26,fake_curtis40@example.com
9/25/2019 3:12,legit,192.169.85.29,fake_rmiranda@example.org
```

**Note**  
The CSV data file can contain double quotes and commas as part of your data. 
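For example, Python's standard `csv` module applies the required quoting automatically when a value contains a comma or double quotes. The following sketch writes a small event file in the format shown above; the file name and sample values are illustrative:

```python
import csv

# Headers must match the variables defined in the event type,
# plus the required EVENT_TIMESTAMP and EVENT_LABEL columns.
rows = [
    ["EVENT_TIMESTAMP", "EVENT_LABEL", "ip_address", "email_address"],
    ["4/10/2019 11:05", "fraud", "209.146.137.48", "fake_burtonlinda@example.net"],
    # A value containing a comma or quotes is quoted automatically by csv.writer.
    ["12/20/2018 20:04", "legit", "203.0.112.189", 'fake_"dave",b@example.org'],
]

with open("registration_events.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)
```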

A simplified version of the corresponding event type is represented below. The event variables correspond to the headers in the CSV file and the values in `EVENT_LABEL` correspond to the values in the labels list.

```
(
name = 'sample_registration',
eventVariables = ['ip_address', 'email_address'],
labels = ['legit', 'fraud'],
entityTypes = ['sample_customer']
)
```

## Event Timestamp formats
<a name="timestamp-formats"></a>

Ensure that your event timestamp is in the required format. As part of the model build process, the Online Fraud Insights model type orders your data based on the event timestamp, and splits your data for training and testing purposes. To get a fair estimate of performance, the model first trains on the training dataset, and then tests this model on the test dataset.
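To illustrate the idea of this chronological (out-of-time) split, here is a simplified sketch in Python. It is not the service's internal logic; the timestamp format and the 80/20 split fraction are assumptions for the example:

```python
from datetime import datetime

def out_of_time_split(events, test_fraction=0.2):
    """Sort events chronologically and hold out the most recent slice for testing."""
    ordered = sorted(
        events,
        key=lambda e: datetime.strptime(e["EVENT_TIMESTAMP"], "%m/%d/%Y %H:%M"),
    )
    cut = int(len(ordered) * (1 - test_fraction))
    return ordered[:cut], ordered[cut:]

events = [
    {"EVENT_TIMESTAMP": "4/10/2019 11:05", "EVENT_LABEL": "fraud"},
    {"EVENT_TIMESTAMP": "12/20/2018 20:04", "EVENT_LABEL": "legit"},
    {"EVENT_TIMESTAMP": "3/14/2019 10:56", "EVENT_LABEL": "legit"},
    {"EVENT_TIMESTAMP": "1/3/2019 8:38", "EVENT_LABEL": "legit"},
    {"EVENT_TIMESTAMP": "9/25/2019 3:12", "EVENT_LABEL": "legit"},
]
train, test = out_of_time_split(events)  # oldest 80% train, newest 20% test
```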

Amazon Fraud Detector supports the following date/timestamp formats for the values in `EVENT_TIMESTAMP` during model training:
+ %yyyy-%mm-%ddT%hh:%mm:%ssZ (ISO 8601 standard in UTC only with no milliseconds)

  Example: 2019-11-30T13:01:01Z 
+ %yyyy/%mm/%dd %hh:%mm:%ss (AM/PM)

  Examples: 2019/11/30 1:01:01 PM, or 2019/11/30 13:01:01 
+ %mm/%dd/%yyyy %hh:%mm:%ss

  Examples: 11/30/2019 1:01:01 PM, 11/30/2019 13:01:01 
+ %mm/%dd/%yy %hh:%mm:%ss

  Examples: 11/30/19 1:01:01 PM, 11/30/19 13:01:01 

Amazon Fraud Detector makes the following assumptions when parsing date/timestamp formats for event timestamps:
+ If you are using the ISO 8601 standard, it must be an exact match of the preceding specification
+ If you are using one of the other formats, there is additional flexibility:
  + For months and days, you can provide single or double digits. For example, 1/12/2019 is a valid date.
  + You do not need to include hh:mm:ss if you do not have them (that is, you can simply provide a date). You can also provide a subset of just the hours and minutes (for example, hh:mm). Providing only the hour is not supported. Milliseconds are also not supported.
  + If you provide AM/PM labels, a 12-hour clock is assumed. If there is no AM/PM information, a 24-hour clock is assumed.
  + You can use “/” or “-” as delimiters for the date elements. “:” is assumed for the timestamp elements.
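As a client-side sanity check against these formats, you can attempt to parse each timestamp with a set of candidate format strings. This sketch only approximates the rules above (for example, it does not cover date-only values or hh:mm subsets), and the service performs its own parsing:

```python
from datetime import datetime

# Candidate format strings approximating the supported patterns. strptime
# accepts single- or double-digit months, days, and hours when parsing,
# matching the flexibility described above for the non-ISO formats.
SUPPORTED_FORMATS = [
    "%Y-%m-%dT%H:%M:%SZ",    # ISO 8601 in UTC, no milliseconds
    "%Y/%m/%d %I:%M:%S %p",  # yyyy/mm/dd with AM/PM (12-hour clock)
    "%Y/%m/%d %H:%M:%S",     # yyyy/mm/dd, 24-hour clock
    "%m/%d/%Y %I:%M:%S %p",
    "%m/%d/%Y %H:%M:%S",
    "%m/%d/%y %I:%M:%S %p",
    "%m/%d/%y %H:%M:%S",
]

def is_supported_timestamp(value):
    """Return True if value parses under one of the candidate formats."""
    for fmt in SUPPORTED_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            pass
    return False
```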

## Sampling your dataset across time
<a name="sample-your-dataset"></a>

We recommend that you provide examples of fraud and legitimate samples from the same time range. For example, if you provide fraud events from the past 6 months, you should also provide legitimate events that evenly span the same time period. If your dataset contains a highly uneven distribution of fraud and legitimate events, you might receive the following error: *"The fraud distribution across time is unacceptably fluctuant. Cannot split dataset properly."* Typically, the easiest fix for this error is to ensure that the fraud events and legitimate events are sampled evenly across the same timeframe. You also might need to remove data if you experienced a large spike in fraud within a short time period. 

If you cannot generate enough data to create an evenly distributed dataset, one approach is to randomize the EVENT\_TIMESTAMP of your events such that they are evenly distributed. However, this often results in unrealistic performance metrics because Amazon Fraud Detector uses EVENT\_TIMESTAMP to evaluate models on the appropriate subset of events in your dataset. 
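One way to spot an uneven distribution before training is to count fraud and legitimate events per month. The following sketch is illustrative only; the timestamp format is assumed from the earlier sample data:

```python
from collections import Counter
from datetime import datetime

def label_counts_by_month(events):
    """Count events per (year, month, label) to reveal uneven fraud/legit sampling."""
    counts = Counter()
    for e in events:
        ts = datetime.strptime(e["EVENT_TIMESTAMP"], "%m/%d/%Y %H:%M")
        counts[(ts.year, ts.month, e["EVENT_LABEL"])] += 1
    return counts

events = [
    {"EVENT_TIMESTAMP": "4/10/2019 11:05", "EVENT_LABEL": "fraud"},
    {"EVENT_TIMESTAMP": "4/12/2019 9:30", "EVENT_LABEL": "legit"},
    {"EVENT_TIMESTAMP": "5/02/2019 14:00", "EVENT_LABEL": "legit"},
]
counts = label_counts_by_month(events)
# Months with fraud but no legitimate events (or vice versa) suggest uneven sampling.
```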

## Null and missing values
<a name="null-missing-values"></a>

Amazon Fraud Detector handles null and missing values. However, the percentage of nulls for a variable should be limited. The EVENT\_TIMESTAMP and EVENT\_LABEL columns must not contain any missing values.
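A quick way to verify this before uploading is to compute the fraction of empty values per column. The following sketch assumes a CSV with headers like the samples in this guide:

```python
import csv
import io

def null_rates(csv_text):
    """Return the fraction of empty values per column.

    EVENT_TIMESTAMP and EVENT_LABEL should come back as 0.0.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {
        col: sum(1 for r in rows if not r[col].strip()) / len(rows)
        for col in rows[0]
    }

sample = """EVENT_TIMESTAMP,EVENT_LABEL,ip_address
4/10/2019 11:05,fraud,209.146.137.48
12/20/2018 20:04,legit,
"""
rates = null_rates(sample)
```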

## File validation
<a name="csv-file-validation"></a>

Amazon Fraud Detector will fail to train a model if either of the following conditions is met:
+ The CSV file cannot be parsed.
+ The data type for a column is incorrect.

# Upload your event data to an Amazon S3 bucket
<a name="uploading-to-an-s3-bucket"></a>

After you create a CSV file with your event data, upload the file to your Amazon S3 bucket.

**To upload to an Amazon S3 bucket**

1. Sign in to the AWS Management Console and open the Amazon S3 console at [https://console.aws.amazon.com/s3/](https://console.aws.amazon.com/s3/).

1. Choose **Create bucket**.

   The **Create bucket** wizard opens.

1. In **Bucket name**, enter a DNS-compliant name for your bucket.

   The bucket name must:
   + Be unique across all of Amazon S3.
   + Be between 3 and 63 characters long.
   + Not contain uppercase characters.
   + Start with a lowercase letter or number.

   After you create the bucket, you can't change its name. For information about naming buckets, see [ Bucket naming rules](https://docs.aws.amazon.com/AmazonS3/latest/userguide/BucketRestrictions.html#bucketnamingrules) in the *Amazon Simple Storage Service User Guide*.
**Important**  
Avoid including sensitive information, such as account numbers, in the bucket name. The bucket name is visible in the URLs that point to the objects in the bucket.

1. In **Region**, choose the AWS Region where you want the bucket to reside. You must select the same Region in which you are using Amazon Fraud Detector: US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), Asia Pacific (Singapore), or Asia Pacific (Sydney). 

1. In **Bucket settings for Block Public Access**, choose the Block Public Access settings that you want to apply to the bucket. 

   We recommend that you leave all settings enabled. For more information about blocking public access, see [Blocking public access to your Amazon S3 storage](https://docs.aws.amazon.com/AmazonS3/latest/dev/access-control-block-public-access.html) in the *Amazon Simple Storage Service User Guide*.

1. Choose **Create bucket**.

1. Upload your training data file to your Amazon S3 bucket. Note the Amazon S3 location path for your training file (for example, s3://bucketname/object.csv).

# Store your event data internally with Amazon Fraud Detector
<a name="storing-event-data-afd"></a>

You can choose to store event data in Amazon Fraud Detector and use the stored data later to train your models. By storing event data in Amazon Fraud Detector, you can train models that use auto-computed variables to improve performance, simplify model retraining, and update fraud labels to close the machine learning feedback loop. Events are stored at the Event Type resource level, so all events of the same event type are stored together in a single event type dataset. As part of defining an event type, you can optionally specify whether to store events for that event type by toggling the *Event Ingestion* setting in the Amazon Fraud Detector console. 

You can either store single events or import large event datasets in Amazon Fraud Detector. Single events can be streamed using the [GetEventPrediction](https://docs.aws.amazon.com//frauddetector/latest/api/API_GetEventPrediction.html) API or the [SendEvent](https://docs.aws.amazon.com//frauddetector/latest/api/API_SendEvent.html) API. Large datasets can be quickly and easily imported to Amazon Fraud Detector using the batch import feature in the Amazon Fraud Detector console or using the [CreateBatchImportJob](https://docs.aws.amazon.com//frauddetector/latest/api/API_CreateBatchImportJob.html) API.

You can use the Amazon Fraud Detector console at any time to check the number of events already stored for each event type.

# Prepare event data for storage
<a name="prepare-storage-event-data"></a>

Event data that is stored internally with Amazon Fraud Detector is stored at the `Event Type` resource level. That is, all events of the same event type are stored in a single event type dataset. The stored events can later be used to train a new model or retrain an existing model. When training a model using the stored event data, you can optionally specify a time range of events to limit the size of your training dataset. 

Each time you store your data in Amazon Fraud Detector, using the Amazon Fraud Detector console, the `SendEvent` API, or the `CreateBatchImportJob` API, Amazon Fraud Detector validates your data before storing it. If your data fails validation, the event data is not stored.

**Prerequisites for storing data internally with Amazon Fraud Detector**
+ To ensure that your event data passes validation and the dataset is stored successfully, use the insights provided by the [Data models explorer](https://docs.aws.amazon.com/frauddetector/latest/ug/create-event-dataset.html#prepare-event-dataset) to prepare your dataset. 
+ Create an event type for the event data you want to store with Amazon Fraud Detector. If you haven't, follow the instructions in [Create an event type](https://docs.aws.amazon.com//frauddetector/latest/ug/create-event-type.html).

## Smart Data Validation
<a name="smart-data-validation"></a>

When you upload your dataset in the Amazon Fraud Detector console for batch import, Amazon Fraud Detector uses Smart Data Validation (SDV) to validate your dataset before importing your data. SDV scans the uploaded data file and identifies issues such as missing data and incorrect formats or data types. In addition to validating your dataset, SDV provides a validation report that lists all identified issues and suggests actions to fix those that are most impactful. Some of the issues identified by SDV might be critical and must be addressed before Amazon Fraud Detector can successfully import your dataset. For more information, see [Smart Data Validation report](storing-events-batch-import.md#sdv-validation-report). 

SDV validates your dataset at the file level and at the data (row) level. At the file level, SDV scans your data file and identifies issues such as inadequate permissions to access the file, an incorrect file size, file format, or headers (event metadata and event variables). At the data level, SDV scans each event (row) and identifies issues such as an incorrect data format, data length, timestamp format, or null values. 

Smart Data Validation is currently available in the Amazon Fraud Detector console only and the validation is turned on by default. If you don't want Amazon Fraud Detector to use the Smart Data Validation before importing your dataset, turn off the validation in the Amazon Fraud Detector console when uploading your dataset. 

## Validating stored data when using APIs or AWS SDK
<a name="validating-stored-data-api"></a>

When uploading events via the `SendEvent`, `GetEventPrediction`, or `CreateBatchImportJob` API operation, Amazon Fraud Detector validates the following:
+ The EventIngestion setting for that event type is ENABLED.
+ Event timestamps are not updated. An event with a repeated event ID and a different EVENT\_TIMESTAMP is treated as an error.
+ Variable names and values match the format expected by the event type. For more information, see [Create a variable](create-a-variable.md).
+ Required variables are populated with a value.
+ All event timestamps are not older than 18 months and are not in the future.
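The timestamp window in the last rule can be pre-checked on the client before calling the API. In the following sketch, 18 months is approximated as 548 days, and the timestamp is assumed to be in the ISO 8601 UTC format:

```python
from datetime import datetime, timedelta, timezone

def timestamp_in_window(event_timestamp, max_age_days=548):
    """Return True if an ISO 8601 UTC timestamp is neither in the future
    nor older than roughly 18 months (approximated here as 548 days)."""
    ts = datetime.strptime(event_timestamp, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    now = datetime.now(timezone.utc)
    return now - timedelta(days=max_age_days) <= ts <= now
```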

# Store event data using batch import
<a name="storing-events-batch-import"></a>

With the batch import feature, you can quickly and easily upload large historical event datasets to Amazon Fraud Detector using the console, the API, or the AWS SDK. To use batch import, create an input file in CSV format that contains all your event data, upload the CSV file to an Amazon S3 bucket, and start an *Import* job. Amazon Fraud Detector first validates the data based on the event type, and then automatically imports the entire dataset. After the data is imported, it's ready to be used for training new models or for retraining existing models.

## Input and output files
<a name="input-output-batch"></a>

The input CSV file must contain headers that match the variables defined in the associated event type, plus four mandatory variables. See [Prepare event data for storage](prepare-storage-event-data.md) for more information. The maximum size of the input data file is 20 gigabytes (GB), or about 50 million events; the exact number of events varies with your event size. If the import job was successful, the output file is empty. If the import was unsuccessful, the output file contains the error logs. 
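You can verify the input file size locally before starting an import job. The sketch below assumes the 20 GB limit means 20 × 1024³ bytes; the exact accounting used by the service is not specified here:

```python
import os

MAX_BYTES = 20 * 1024**3  # assumed interpretation of the 20 GB input limit

def within_size_limit(path):
    """Return True if the input file is within the assumed batch import limit."""
    return os.path.getsize(path) <= MAX_BYTES
```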

## Create a CSV file
<a name="create-csv-stored-data"></a>

Amazon Fraud Detector imports data only from files that are in the comma-separated values (CSV) format. The first row of your CSV file must contain column headers that exactly match the variables defined in the associated event type, plus four mandatory variables: EVENT\_ID, EVENT\_TIMESTAMP, ENTITY\_ID, and ENTITY\_TYPE. You can also optionally include EVENT\_LABEL and LABEL\_TIMESTAMP (LABEL\_TIMESTAMP is required if EVENT\_LABEL is included). 

**Define mandatory variables**

Mandatory variables are considered event metadata and must be specified in uppercase. Event metadata is automatically included for model training. The following table lists the mandatory variables, a description of each variable, and the required format for the variable.


| Name | Description | Requirements | 
| --- | --- | --- | 
|  EVENT\_ID  |  An identifier for the event. For example, if your event is an online transaction, EVENT\_ID might be the transaction reference number that was provided to your customer.  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/frauddetector/latest/ug/storing-events-batch-import.html)  | 
|  EVENT\_TIMESTAMP  |  The timestamp of when the event occurred. The timestamp must be in ISO 8601 standard in UTC.  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/frauddetector/latest/ug/storing-events-batch-import.html)  | 
|  ENTITY\_ID  |  An identifier for the entity performing the event.  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/frauddetector/latest/ug/storing-events-batch-import.html)  | 
|  ENTITY\_TYPE  |  The entity that performs the event, such as a merchant or a customer.  |  ENTITY\_TYPE is required for batch import jobs  | 
|  EVENT\_LABEL  |  Classifies the event as `fraudulent` or `legitimate`.  |  EVENT\_LABEL is required if LABEL\_TIMESTAMP is included  | 
|  LABEL\_TIMESTAMP  |  The timestamp when the event label was last populated or updated.  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/frauddetector/latest/ug/storing-events-batch-import.html)  | 
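The header requirements above can be pre-checked with a few lines of Python before starting an import job. This sketch is illustrative only; Amazon Fraud Detector performs its own validation:

```python
# The four mandatory event-metadata columns for batch import.
MANDATORY = {"EVENT_ID", "EVENT_TIMESTAMP", "ENTITY_ID", "ENTITY_TYPE"}

def missing_headers(header_row):
    """Return the set of required columns missing from a CSV header row.

    EVENT_LABEL and LABEL_TIMESTAMP are optional, but each one requires
    the other when present.
    """
    headers = set(header_row)
    missing = MANDATORY - headers
    if "EVENT_LABEL" in headers and "LABEL_TIMESTAMP" not in headers:
        missing.add("LABEL_TIMESTAMP")
    if "LABEL_TIMESTAMP" in headers and "EVENT_LABEL" not in headers:
        missing.add("EVENT_LABEL")
    return missing
```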

## Upload CSV file to Amazon S3 for batch import
<a name="upload-csv-S3-for-batch-import"></a>

After you create a CSV file with your data, upload the file to your Amazon Simple Storage Service (Amazon S3) bucket.

**To upload event data to an Amazon S3 bucket**

1. Sign in to the AWS Management Console and open the Amazon S3 console at [https://console.aws.amazon.com/s3/](https://console.aws.amazon.com/s3/).

1. Choose **Create bucket**.

   The **Create bucket** wizard opens.

1. In **Bucket name**, enter a DNS-compliant name for your bucket.

   The bucket name must:
   + Be unique across all of Amazon S3.
   + Be between 3 and 63 characters long.
   + Not contain uppercase characters.
   + Start with a lowercase letter or number.

   After you create the bucket, you can't change its name. For information about naming buckets, see [ Bucket naming rules](https://docs.aws.amazon.com/AmazonS3/latest/userguide/BucketRestrictions.html#bucketnamingrules) in the *Amazon Simple Storage Service User Guide*.
**Important**  
Avoid including sensitive information, such as account numbers, in the bucket name. The bucket name is visible in the URLs that point to the objects in the bucket.

1. In **Region**, choose the AWS Region where you want the bucket to reside. You must select the same Region in which you are using Amazon Fraud Detector: US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), Asia Pacific (Singapore), or Asia Pacific (Sydney). 

1. In **Bucket settings for Block Public Access**, choose the Block Public Access settings that you want to apply to the bucket. 

   We recommend that you leave all settings enabled. For more information about blocking public access, see [Blocking public access to your Amazon S3 storage](https://docs.aws.amazon.com/AmazonS3/latest/dev/access-control-block-public-access.html) in the *Amazon Simple Storage Service User Guide*.

1. Choose **Create bucket**.

1. Upload your training data file to your Amazon S3 bucket. Note the Amazon S3 location path for your training file (for example, s3://bucketname/object.csv).

## Batch import event data in Amazon Fraud Detector console
<a name="batch-import-event-data"></a>

You can easily import large event datasets in the Amazon Fraud Detector console, using the `CreateBatchImportJob` API, or using the AWS SDK. Before you proceed, make sure that you have followed the instructions to prepare your dataset as a CSV file, and that you have uploaded the CSV file to an Amazon S3 bucket.

**Using Amazon Fraud Detector console**

**To batch import event data in console**

1. Open the AWS Management Console, sign in to your account, and navigate to Amazon Fraud Detector.

1. In the left navigation pane, choose **Events**.

1. Choose your event type.

1. Select the **Stored events** tab.

1. In the **Stored events details** pane, make sure that **Event ingestion** is **ON**.

1. In the **Import events data** pane, choose **New Import**.

1. In the **New events import** page, provide the following information:
   + [Recommended] Leave **Enable Smart Data Validation for this dataset - new** set to the default setting.
   + For **IAM role for data**, select the IAM role that you created for the Amazon S3 bucket that holds the CSV file you are planning to import.
   + For **Input data location**, enter the S3 location where you have your CSV file. 
   + If you want to specify a separate location to store your import results, choose **Separate data location for inputs and results** and provide a valid Amazon S3 bucket location.
**Important**  
Make sure that the IAM role you selected has read permissions to your input Amazon S3 bucket and write permissions to your output Amazon S3 bucket.

1. Choose **Start**.

1. The **Status** column in the **Import events data** pane displays the status of your validation and import job. The banner at the top provides a high-level description of the status as your dataset first goes through validation and then the import.

1. Follow the guidance provided to [Monitor the progress of dataset validation and import job](#monitor-progress-sdv).

### Monitor the progress of dataset validation and import job
<a name="monitor-progress-sdv"></a>

If you are using the Amazon Fraud Detector console to perform a batch import job, by default, Amazon Fraud Detector validates your dataset before import. You can monitor the progress and status of validation and import jobs in the **New events import** page of the Amazon Fraud Detector console. A banner at the top of the page provides a brief description of the validation findings and the status of the import job. Depending on the validation findings and the status of your import job you might be required to take actions to ensure successful validation and import of your dataset.

The following table provides details of the actions you must take depending on the outcome of validation and import operations.


| Banner message | Status | What it means | What should I do | 
| --- | --- | --- | --- | 
| Data validation has started | Validation in progress | SDV has started validating your dataset | Wait for the status to change | 
| Data validation cannot proceed due to errors in your dataset. Fix errors in your data file and start a new import job. See the validation report for more information | Validation failed | SDV identified issues in your data file. These issues must be addressed for successful import of your dataset. | In the Import events data pane, select the Job Id and view the validation report. Follow the Recommendations in the report to address all the errors listed. For more information, see [Using the validation report](#using-sdv-validation-report). | 
| Data import has started. Validation completed successfully | Import in progress | Your dataset passed the validation. AFD has started to import your dataset | Wait for the status to change | 
| Validation completed with warnings. Data import has started | Import in progress | Some of the data in your dataset failed validation. However, the data that passed validation meets the minimum data size requirements for import. | Monitor the message in the banner and wait for the status to change | 
| Your data was partially imported. Some of the data failed validation and did not get imported. See validation report for more information. | Imported. The status shows a warning icon. | Some of the data in your data file that failed validation did not get imported. The rest of the data that passed validation was imported. | In the Import events data pane, select the Job Id and view the validation report. Follow the Recommendations in the Data level warnings table to address the listed warnings. You need not address all the warnings. However, make sure that your dataset has more than 50% of data that passes validation for a successful import. After you have addressed the warnings, start a new import job. For more information, see [Using the validation report](#using-sdv-validation-report). | 
| Data import failed due to a processing error. Start a new data import job | Import failed | The import failed due to a transient run-time error | Start a new import job | 
| Data was imported successfully | Imported | Both validation and import completed successfully | Select the Job Id of your import job to view details and then proceed with model training | 

**Note**  
We recommend waiting 10 minutes after the dataset has been imported successfully into Amazon Fraud Detector to ensure that the events are fully ingested by the system.

### Smart Data Validation report
<a name="sdv-validation-report"></a>

The Smart Data Validation creates a validation report after validation is complete. The validation report provides details of all the issues that the SDV has identified in your dataset, with suggested actions to fix the most impactful issues. You can use the validation report to determine what the issues are, where the issues are located in the dataset, the severity of the issues, and how to fix them. The validation report is created even when the validation completes successfully. In this case, you can view the report to see if there are any issues listed and if there are, decide if you want to fix any of those.

**Note**  
The current version of SDV scans your dataset for issues that might cause the batch import to fail. If validation and batch import succeed, your dataset can still have issues that might cause model training to fail. We recommend that you view your validation report even if validation and import were successful, and address any issues listed in the report for successful model training. After you have addressed the issues, create a new batch import job. 

**Accessing the validation report**

You can access the validation report any time after the validation completes using one of the following options:

1. After the validation completes and while the import job is in progress, in the top banner, choose **View validation report**.

1. After the import job completes, in the **Import events data** pane, choose the Job ID of the import job that just completed. 

#### Using the validation report
<a name="using-sdv-validation-report"></a>

The validation report page of your import job provides the details of this import job, a list of critical errors if any are found, a list of warnings about specific events (rows) in your dataset if found, and a brief summary of your dataset that includes information such as values that are not valid, and missing values for each variable.
+ **Import job details**

  Provides details of the import job. If your import job has failed or your dataset was partially imported, choose **Go to results file** to view the error logs of the events that failed to import. 
+ **Critical errors**

  Provides details of the most impactful issues in your dataset identified by SDV. All the issues listed in this pane are critical and you must address them before you proceed with import. If you try to import your dataset without addressing the critical issues, your import job might fail.

  To address the critical issues, follow the recommendations provided for each warning. After you have addressed all the issues listed in the Critical errors pane, create a new batch import job. 
+ **Data level warnings**

  Provides a summary of the warnings for specific events (rows) in your dataset. If the Data level warnings pane is populated, some of the events in your dataset failed validation and were not imported. 

  For each warning, the **Description** column displays the number of events that have the issue, and the **Sample event IDs** column provides a partial list of sample event IDs that you can use as a starting point to locate the rest of the events with the issue. Use the **Recommendation** provided for the warning to fix the issue. Also use the error logs from your output file for additional information about the issue. The error logs are generated for all the events that failed batch import. To access the error logs, in the **Import job details** pane, choose **Go to results file**. 
**Note**  
If more than 50% of the events (rows) in your dataset failed validation, the import job also fails. In this case, you must fix the data before you start a new import job. 
+ **Dataset summary** 

   Provides a summary of the validation report of your dataset. If the **Number of warnings** column shows more than 0 warnings, decide whether you need to fix those warnings. If the **Number of warnings** column shows 0, continue to train your model. 

## Batch import event data using the AWS SDK for Python (Boto3)
<a name="batch-import-data-sdk"></a>

The following example shows a sample request for the [CreateBatchImportJob](https://docs.aws.amazon.com//frauddetector/latest/api/API_CreateBatchImportJob.html) API. A batch import job must include a **jobId**, **inputPath**, **outputPath**, **eventTypeName**, and **iamRoleArn**. The jobId can't be the same as the ID of a past job unless that job is in the CREATE\_FAILED state. The inputPath and outputPath must be valid Amazon S3 paths. You can omit the file name in the outputPath, but you still need to provide a valid Amazon S3 bucket location. The eventTypeName and iamRoleArn must exist. The IAM role must grant read permissions to the input Amazon S3 bucket and write permissions to the output Amazon S3 bucket. 

```
import boto3
fraudDetector = boto3.client('frauddetector')

fraudDetector.create_batch_import_job(
    jobId = 'sample_batch_import',
    inputPath = 's3://bucket_name/input_file_name.csv',
    outputPath = 's3://bucket_name/',
    eventTypeName = 'sample_registration',
    iamRoleArn = 'arn:aws:iam::************:role/service-role/AmazonFraudDetector-DataAccessRole-*************'
)
```

## Cancel batch import job
<a name="cancel-batch-import"></a>

You can cancel an in-progress batch import job at any time in the Amazon Fraud Detector console, using the `CancelBatchImportJob` API, or using the AWS SDK. 

**To cancel a batch import job in console,**

1. Open the AWS Management Console, sign in to your account, and navigate to Amazon Fraud Detector.

1. In the left navigation pane, choose **Events**.

1. Choose your event type.

1. Select the **Stored events** tab.

1. In the **Import events data** pane, choose the job ID of the in-progress import job that you want to cancel.

1. On the event job page, choose **Actions**, and then select **Cancel events import**.

1. Choose **Stop events import** to cancel the batch import job.

### Canceling a batch import job using the AWS SDK for Python (Boto3)
<a name="cancel-batch-import-sdk"></a>

The following example shows a sample request for the `CancelBatchImportJob` API. The request must include the job ID of an in-progress batch import job. 

```
import boto3
fraudDetector = boto3.client('frauddetector')
fraudDetector.cancel_batch_import_job (
    jobId = 'sample_batch'
)
```

# Store event data using the GetEventPrediction API operation
<a name="storing-events-geteventprediction-api"></a>

By default, all events sent to the `GetEventPrediction` API for evaluation are stored in Amazon Fraud Detector. This means that Amazon Fraud Detector automatically stores event data when you generate a prediction and uses that data to update calculated variables in near-real time. You can disable data storage by navigating to the event type in the Amazon Fraud Detector console and setting **Event ingestion** to **OFF**, or by updating the `EventIngestion` value to `DISABLED` using the `PutEventType` API operation. For more information about the `GetEventPrediction` API operation, see [Fraud predictions](getting-fraud-predictions.md). 

**Important**  
We highly recommend that after you enable *Event ingestion* for an event type, you keep it enabled. Disabling event ingestion for that event type and then generating predictions might result in inconsistent behavior.
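If you do need to change the setting programmatically, note that `PutEventType` creates or updates the full event type definition. A common pattern is to read the current definition with `GetEventTypes` and resend it with a new `eventIngestion` value. The helper below is a sketch of that pattern; it takes the boto3 client as a parameter, and the response fields it reads are assumptions based on the `GetEventTypes` API reference.

```python
def set_event_ingestion(client, event_type_name, enabled):
    """Update an event type's eventIngestion setting.

    PutEventType requires the full event type definition, so read the
    current definition first and resend it with the new setting.
    """
    current = client.get_event_types(name=event_type_name)['eventTypes'][0]
    client.put_event_type(
        name=event_type_name,
        eventVariables=current['eventVariables'],
        entityTypes=current['entityTypes'],
        labels=current.get('labels', []),
        eventIngestion='ENABLED' if enabled else 'DISABLED',
    )
```

For example, `set_event_ingestion(boto3.client('frauddetector'), 'sample_registration', True)` re-enables ingestion for the `sample_registration` event type.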

# Store event data using the SendEvent API operation
<a name="storing-events-sendevent-api"></a>

You can use the `SendEvent` API operation to store events in Amazon Fraud Detector without generating fraud predictions for those events. For example, you can use the `SendEvent` operation to upload a historical dataset, which you can later use to train a model.
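For example, one way to replay a historical CSV file through `SendEvent` is a loop like the following sketch. The column names (`EVENT_ID`, `EVENT_TIMESTAMP`, `ENTITY_ID`, `EVENT_LABEL`) are assumptions about your file layout; every remaining column is sent as an event variable.

```python
import csv

def upload_events(client, csv_path, event_type_name, entity_type):
    """Replay a historical CSV file through the SendEvent API.

    Assumes the file has EVENT_ID, EVENT_TIMESTAMP, and ENTITY_ID columns,
    plus an optional EVENT_LABEL column; all other columns are treated as
    event variables.
    """
    with open(csv_path, newline='') as f:
        for row in csv.DictReader(f):
            event_id = row.pop('EVENT_ID')
            timestamp = row.pop('EVENT_TIMESTAMP')
            entity_id = row.pop('ENTITY_ID')
            label = row.pop('EVENT_LABEL', None)
            kwargs = {
                'eventId': event_id,
                'eventTypeName': event_type_name,
                'eventTimestamp': timestamp,
                'eventVariables': row,  # all remaining columns
                'entities': [{'entityType': entity_type,
                              'entityId': entity_id}],
            }
            if label:
                kwargs['assignedLabel'] = label
                kwargs['labelTimestamp'] = timestamp
            client.send_event(**kwargs)
```

You would call this with a real client, for example `upload_events(boto3.client('frauddetector'), 'events.csv', 'sample_registration', 'sample_customer')`.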

**Event Timestamp formats for SendEvent API**

When you store event data using the `SendEvent` API, you must ensure that your event timestamp is in a supported format. Amazon Fraud Detector supports the following date/timestamp formats:
+ %yyyy-%mm-%ddT%hh:%mm:%ssZ (ISO 8601 standard in UTC only with no milliseconds)

  Example: 2019-11-30T13:01:01Z 
+ %yyyy/%mm/%dd %hh:%mm:%ss (AM/PM)

  Examples: 2019/11/30 1:01:01 PM, or 2019/11/30 13:01:01 
+ %mm/%dd/%yyyy %hh:%mm:%ss

  Examples: 11/30/2019 1:01:01 PM, 11/30/2019 13:01:01 
+ %mm/%dd/%yy %hh:%mm:%ss

  Examples: 11/30/19 1:01:01 PM, 11/30/19 13:01:01 

Amazon Fraud Detector makes the following assumptions when parsing date/timestamp formats for event timestamps:
+ If you are using the ISO 8601 standard, it must be an exact match of the preceding specification
+ If you are using one of the other formats, there is additional flexibility:
  + For months and days, you can provide single or double digits. For example, 1/12/2019 is a valid date.
  + You do not need to include hh:mm:ss if you do not have them (that is, you can provide just a date). You can also provide a subset of just the hour and minutes (for example, hh:mm). Providing only the hour is not supported. Milliseconds are also not supported.
  + If you provide AM/PM labels, a 12-hour clock is assumed. If there is no AM/PM information, a 24-hour clock is assumed.
  + You can use “/” or “-” as delimiters for the date elements. “:” is assumed for the timestamp elements.
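If your upstream systems produce a mix of these formats, it can be simpler to normalize everything to the ISO 8601 form before calling `SendEvent`. The following sketch handles the full date-plus-time variants listed above (not the date-only or hour-and-minute shorthand) and assumes all inputs represent UTC times.

```python
from datetime import datetime

# Input layouts corresponding to the supported formats above. Two-digit-year
# patterns come before four-digit-year ones so '19' is not parsed as year 19.
INPUT_FORMATS = [
    '%Y-%m-%dT%H:%M:%SZ',    # ISO 8601, UTC
    '%Y/%m/%d %I:%M:%S %p',  # 12-hour clock with AM/PM
    '%Y/%m/%d %H:%M:%S',     # 24-hour clock
    '%m/%d/%y %I:%M:%S %p',
    '%m/%d/%y %H:%M:%S',
    '%m/%d/%Y %I:%M:%S %p',
    '%m/%d/%Y %H:%M:%S',
]

def to_iso8601(timestamp):
    """Parse a supported timestamp string (assumed UTC) and return it in
    the ISO 8601 form that SendEvent accepts."""
    for fmt in INPUT_FORMATS:
        try:
            parsed = datetime.strptime(timestamp, fmt)
        except ValueError:
            continue
        return parsed.strftime('%Y-%m-%dT%H:%M:%SZ')
    raise ValueError(f'Unsupported timestamp format: {timestamp}')
```

For example, `to_iso8601('2019/11/30 1:01:01 PM')` returns `'2019-11-30T13:01:01Z'`.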

The following is an example `SendEvent` API call. 

```
import boto3
fraudDetector = boto3.client('frauddetector')

fraudDetector.send_event(
    eventId        = '802454d3-f7d8-482d-97e8-c4b6db9a0428',
    eventTypeName  = 'sample_registration',
    eventTimestamp = '2020-07-13T23:18:21Z',
    eventVariables = {
        'email_address' : 'johndoe@exampledomain.com',
        'ip_address' : '1.2.3.4'},
    assignedLabel  = 'legit',
    labelTimestamp = '2020-07-13T23:18:21Z',
    entities       = [{'entityType':'sample_customer', 'entityId':'12345'}]
)
```

# Get details of stored event data
<a name="get-stored-event-details"></a>

After you store event data in Amazon Fraud Detector, you can check the latest data that was stored for an event using the [GetEvent](https://docs.aws.amazon.com//frauddetector/latest/api/API_GetEvent.html) API. The following example code checks the latest data stored for the `sample_registration` event.

```
import boto3
fraudDetector = boto3.client('frauddetector')

fraudDetector.get_event(
            eventId        = '802454d3-f7d8-482d-97e8-c4b6db9a0428',
            eventTypeName  = 'sample_registration'
)
```
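To work with the result programmatically, you can pull the stored variables out of the response. The helper below is a sketch that assumes the response shape shown in the `GetEvent` API reference (an `event` object containing an `eventVariables` map).

```python
def get_stored_variables(client, event_id, event_type_name):
    """Return the stored event variables for an event, or None if no
    event data was found."""
    response = client.get_event(eventId=event_id,
                                eventTypeName=event_type_name)
    event = response.get('event')
    if not event:
        return None
    return event.get('eventVariables', {})
```

For example, `get_stored_variables(boto3.client('frauddetector'), '802454d3-f7d8-482d-97e8-c4b6db9a0428', 'sample_registration')` returns the variable map stored for that event.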

# View metrics of stored event dataset
<a name="view-stored-event-metrics"></a>

For each event type, you can view metrics such as the number of stored events, the total size of your stored events, and the timestamps of the earliest and latest stored events in the Amazon Fraud Detector console. 

**To view stored event metrics for an event type,**

1. Open the AWS Console and sign in to your account. Navigate to Amazon Fraud Detector.

1. In the left navigation pane, choose **Events**.

1. Choose your event type.

1. Select the **Stored events** tab.

1. The **Stored events details** pane displays the metrics. These metrics are automatically updated once per day.

1. Optionally, choose **Refresh event metrics** to manually update your metrics. 
**Note**  
If you have just imported your data, we recommend waiting 5 to 10 minutes after the import finishes before refreshing and viewing metrics.