

Amazon Fraud Detector is no longer open to new customers as of November 7, 2025. For capabilities similar to Amazon Fraud Detector, explore Amazon SageMaker, AutoGluon, and AWS WAF.

# Get and upload example dataset
<a name="step-1-get-s3-data"></a>

The example dataset you use in this tutorial provides details of online account registrations. The dataset is a comma-separated values (CSV) text file in UTF-8 format. The first row of the CSV file contains the headers. The header row is followed by rows of data, each containing the data elements of a single account registration. The data is labeled for your convenience: a column in the dataset identifies whether the account registration is fraudulent.

**To get and upload example dataset**

1. Go to [Samples](https://github.com/aws-samples/aws-fraud-detector-samples/tree/master/data).

    There are two data files that contain online account registration data: *registration\_data\_20K\_minimum.csv* and *registration\_data\_20K\_full.csv*. The file `registration_data_20K_minimum.csv` contains only two variables: *ip\_address* and *email\_address*. The file `registration_data_20K_full.csv` contains additional variables for each event, including *billing\_address*, *phone\_number*, and *user\_agent*. Both data files also contain two mandatory fields: 
   + EVENT\_TIMESTAMP – Defines when the event occurred
   + EVENT\_LABEL – Classifies the event as fraudulent or legitimate

   You can use either one of the two files for this tutorial. Download the data file you want to use.
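   After downloading a file, you can quickly confirm that it contains the two mandatory fields before you upload it. The following Python sketch checks a CSV header row; the required column names come from this tutorial, and the sample header and row are illustrative only:

   ```python
   import csv
   import io

   # The two fields Amazon Fraud Detector requires in every training dataset.
   REQUIRED_COLUMNS = {"EVENT_TIMESTAMP", "EVENT_LABEL"}

   def has_required_columns(csv_text: str) -> bool:
       """Return True if the CSV header row contains both mandatory fields."""
       reader = csv.reader(io.StringIO(csv_text))
       header = next(reader, [])
       return REQUIRED_COLUMNS.issubset(header)

   # Hypothetical content resembling registration_data_20K_minimum.csv.
   sample = (
       "ip_address,email_address,EVENT_TIMESTAMP,EVENT_LABEL\n"
       "192.0.2.10,user@example.com,2020-01-01T00:00:00Z,legit\n"
   )
   print(has_required_columns(sample))  # → True
   ```

   In practice, you would pass the first line of your downloaded file instead of the inline sample.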

1. Create an Amazon Simple Storage Service (Amazon S3) bucket.

   In this step, you create external storage for the dataset: an Amazon S3 bucket. For more information about Amazon S3, see [What is Amazon S3?](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html)

   1. Sign in to the AWS Management Console and open the Amazon S3 console at [https://console.aws.amazon.com/s3/](https://console.aws.amazon.com/s3/).

   1. In **Buckets**, choose **Create bucket**.

   1. For **Bucket name**, enter a bucket name. Make sure that you follow the bucket naming rules in the console, and provide a globally unique name. We recommend you use a name that describes the bucket's purpose.

   1. For **AWS Region**, choose the AWS Region where you want to create your bucket. The Region that you choose must support Amazon Fraud Detector. To reduce latency, choose the AWS Region that's closest to your geographic location. For a list of Regions that support Amazon Fraud Detector, see the [Region Table](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/) in the *Global Infrastructure Guide*.

   1. Leave the default settings for **Object Ownership**, **Bucket settings for Block Public Access**, **Bucket Versioning**, and **Tags** for this tutorial.

   1. For **Default encryption**, choose **Disable** for this tutorial.

   1. Review your bucket configuration, and then choose **Create bucket**.
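   As an optional sanity check, the core bucket naming rules can be expressed in a few lines of Python. This sketch covers only a subset of the rules listed in the Amazon S3 User Guide (length, allowed characters, and the no-IP-address rule); the example bucket names are placeholders:

   ```python
   import re

   def is_valid_bucket_name(name: str) -> bool:
       """Check the core Amazon S3 bucket naming rules (a subset; see the
       Amazon S3 User Guide for the full list)."""
       # Between 3 and 63 characters long.
       if not 3 <= len(name) <= 63:
           return False
       # Lowercase letters, numbers, dots, and hyphens only;
       # must begin and end with a letter or number.
       if not re.fullmatch(r"[a-z0-9][a-z0-9.-]*[a-z0-9]", name):
           return False
       # No consecutive dots.
       if ".." in name:
           return False
       # Must not be formatted like an IP address (e.g. 192.168.5.4).
       if re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", name):
           return False
       return True

   print(is_valid_bucket_name("fraud-detector-tutorial-data"))  # → True
   print(is_valid_bucket_name("Fraud_Data"))                    # → False
   ```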

1. Upload the example data file to the Amazon S3 bucket.

   Now that you have a bucket, upload one of the example files that you downloaded previously to the Amazon S3 bucket that you just created.

   1. In the **Buckets** list, choose your bucket.

   1. Choose **Upload**.

   1. In **Files and folders**, choose **Add files**.

   1. Choose one of the example data files that you downloaded on your computer, and then choose **Open**.

   1. Leave the default settings for **Destination**, **Permissions**, and **Properties**.

   1. Review configurations, and then choose **Upload**.

   1. The example data file is uploaded to the Amazon S3 bucket. Make a note of the bucket location. In the **Objects** list, choose the example data file that you just uploaded.

   1. In the **Object overview**, copy the location under **S3 URI**. This is the Amazon S3 location of your example data file. You use it later. You can additionally copy the **Amazon Resource Name (ARN)** of your S3 bucket and save it.
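   The S3 URI and bucket ARN that you copy here follow a fixed pattern, sketched below in Python. The bucket and file names are placeholders for illustration:

   ```python
   def s3_uri(bucket: str, key: str) -> str:
       """S3 URI of an object, as shown under "S3 URI" in the console."""
       return f"s3://{bucket}/{key}"

   def bucket_arn(bucket: str) -> str:
       """ARN of the bucket itself (bucket ARNs are not Region-qualified)."""
       return f"arn:aws:s3:::{bucket}"

   # Placeholder names for illustration only.
   print(s3_uri("fraud-detector-tutorial-data", "registration_data_20K_minimum.csv"))
   # → s3://fraud-detector-tutorial-data/registration_data_20K_minimum.csv
   print(bucket_arn("fraud-detector-tutorial-data"))
   # → arn:aws:s3:::fraud-detector-tutorial-data
   ```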