

# Creating an Amazon Rekognition Custom Labels model

A model is the software that you train to find the concepts, scenes, and objects that are unique to your business. You can create a model with the Amazon Rekognition Custom Labels console or with the AWS SDK. Before creating an Amazon Rekognition Custom Labels model, we recommend that you read [Understanding Amazon Rekognition Custom Labels](understanding-custom-labels.md). 

This section provides console and SDK information about creating a project, creating training and test datasets for different model types, and training a model. Later sections show you how to improve and use your model. For a tutorial that shows you how to create and use a specific type of model with the console, see [Classifying images](tutorial-classification.md). 

**Topics**
+ [Creating a project](mp-create-project.md)
+ [Creating training and test datasets](creating-datasets.md)
+ [Training an Amazon Rekognition Custom Labels model](training-model.md)
+ [Debugging a failed model training](tm-debugging.md)

# Creating a project


A project manages the model versions, training dataset, and test dataset for a model. You can create a project with the Amazon Rekognition Custom Labels console or with the API. For other project tasks, such as deleting a project, see [Managing an Amazon Rekognition Custom Labels project](managing-project.md).

 You can use tags to categorize and manage your Amazon Rekognition Custom Labels resources, including your projects. 

 The [CreateProject](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_CreateProject) operation allows you to optionally specify tags when creating a new project, providing the Tags as key-value pairs that you can use to categorize and manage your resources. 

## Creating an Amazon Rekognition Custom Labels Project (Console)

You can use the Amazon Rekognition Custom Labels console to create a project. The first time you use the console in a new AWS Region, Amazon Rekognition Custom Labels asks to create an Amazon S3 bucket (console bucket) in your AWS account. The bucket is used to store project files. You can't use the Amazon Rekognition Custom Labels console unless the console bucket is created.


**To create a project (console)**

1. Sign in to the AWS Management Console and open the Amazon Rekognition console at [https://console.aws.amazon.com/rekognition/](https://console.aws.amazon.com/rekognition/).

1. In the left pane, choose **Use Custom Labels**. The Amazon Rekognition Custom Labels landing page is shown.

1. On the Amazon Rekognition Custom Labels landing page, choose **Get started**.

1. In the left pane, choose **Projects**. 

1. Choose **Create Project**. 

1. In **Project name**, enter a name for your project. 

1. Choose **Create project** to create your project. 

1. Follow the steps in [Creating training and test datasets](creating-datasets.md) to create the training and test datasets for your project.

## Creating an Amazon Rekognition Custom Labels project (SDK)

You create an Amazon Rekognition Custom Labels project by calling [CreateProject](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_CreateProject). The response is an Amazon Resource Name (ARN) that identifies the project. After you create a project, you create datasets for training and testing a model. For more information, see [Creating training and test datasets with images](md-create-dataset.md). 

**To create a project (SDK)**

1. If you haven't already done so, install and configure the AWS CLI and the AWS SDKs. For more information, see [Step 4: Set up the AWS CLI and AWS SDKs](su-awscli-sdk.md).

1. Use the following code to create a project. 

------
#### [ AWS CLI ]

   The following example creates a project and displays its ARN.

   Change the value of `project-name` to the name of the project that you want to create.

   ```
   aws rekognition create-project --project-name my_project \
     --feature CUSTOM_LABELS \
     --tags '{"key1":"value1","key2":"value2"}' \
     --profile custom-labels-access
   ```

------
#### [ Python ]

   The following example creates a project and displays its ARN. Supply the following command line arguments:
   + `project_name` – the name of the project you want to create.

   ```
   # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
   # SPDX-License-Identifier: Apache-2.0
   
   import argparse
   import logging
   import boto3
   
   from botocore.exceptions import ClientError
   
   logger = logging.getLogger(__name__)
   
   def create_project(rek_client, project_name):
       """
       Creates an Amazon Rekognition Custom Labels project
       :param rek_client: The Amazon Rekognition Custom Labels Boto3 client.
        :param project_name: A name for the new project.
       """
   
       try:
            # Create the project.
            logger.info("Creating project: %s", project_name)

            response = rek_client.create_project(ProjectName=project_name)

            logger.info("Project ARN: %s", response['ProjectArn'])
   
           return response['ProjectArn']
      
       
       except ClientError as err:  
           logger.exception("Couldn't create project - %s: %s", project_name, err.response['Error']['Message'])
           raise
   
   def add_arguments(parser):
       """
       Adds command line arguments to the parser.
       :param parser: The command line parser.
       """
   
       parser.add_argument(
           "project_name", help="A name for the new project."
       )
   
   
   def main():
   
       logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
   
       try:
   
           # Get command line arguments.
           parser = argparse.ArgumentParser(usage=argparse.SUPPRESS)
           add_arguments(parser)
           args = parser.parse_args()
   
           print(f"Creating project: {args.project_name}")
   
           # Create the project.
           session = boto3.Session(profile_name='custom-labels-access')
           rekognition_client = session.client("rekognition")
   
           project_arn=create_project(rekognition_client, 
               args.project_name)
   
           print(f"Finished creating project: {args.project_name}")
           print(f"ARN: {project_arn}")
   
       except ClientError as err:
           logger.exception("Problem creating project: %s", err)
           print(f"Problem creating project: {err}")
   
   
   if __name__ == "__main__":
       main()
   ```

------
#### [ Java V2 ]

   The following example creates a project and displays its ARN.

   Supply the following command line argument:
   + `project_name` – the name of the project you want to create.

   ```
   /*
      Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
      SPDX-License-Identifier: Apache-2.0
   */
   package com.example.rekognition;
   
   import software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider;
   import software.amazon.awssdk.regions.Region;
   import software.amazon.awssdk.services.rekognition.RekognitionClient;
   import software.amazon.awssdk.services.rekognition.model.CreateProjectRequest;
   import software.amazon.awssdk.services.rekognition.model.CreateProjectResponse;
   import software.amazon.awssdk.services.rekognition.model.RekognitionException;
   
   import java.util.logging.Level;
   import java.util.logging.Logger;
   
   public class CreateProject {
   
       public static final Logger logger = Logger.getLogger(CreateProject.class.getName());
   
       public static String createMyProject(RekognitionClient rekClient, String projectName) {
   
           try {
   
               logger.log(Level.INFO, "Creating project: {0}", projectName);
               CreateProjectRequest createProjectRequest = CreateProjectRequest.builder().projectName(projectName).build();
   
               CreateProjectResponse response = rekClient.createProject(createProjectRequest);
   
               logger.log(Level.INFO, "Project ARN: {0} ", response.projectArn());
   
               return response.projectArn();
   
           } catch (RekognitionException e) {
               logger.log(Level.SEVERE, "Could not create project: {0}", e.getMessage());
               throw e;
           }
   
       }
   
       public static void main(String[] args) {
   
        final String USAGE = "\n" + "Usage: " + "<project_name>\n\n" + "Where:\n"
                + "   project_name - A name for the new project\n\n";
   
           if (args.length != 1) {
               System.out.println(USAGE);
               System.exit(1);
           }
   
           String projectName = args[0];
        String projectArn = null;
   
           try {
   
               // Get the Rekognition client.
               RekognitionClient rekClient = RekognitionClient.builder()
                   .credentialsProvider(ProfileCredentialsProvider.create("custom-labels-access"))
                   .region(Region.US_WEST_2)
                   .build();
   
               // Create the project
               projectArn = createMyProject(rekClient, projectName);
   
               System.out.println(String.format("Created project: %s %nProject ARN: %s", projectName, projectArn));
   
               rekClient.close();
   
           } catch (RekognitionException rekError) {
               logger.log(Level.SEVERE, "Rekognition client error: {0}", rekError.getMessage());
               System.exit(1);
           }
   
       }
   
   }
   ```

------

1. Note the project ARN that's displayed in the response. You'll need it to create a model. 

1. Follow the steps in [Create training and test datasets (SDK)](md-create-dataset.md#cd-create-dataset-sdk) to create the training and test datasets for your project.

## CreateProject operation request

The following is the format of the CreateProject operation request: 

```
{
 "AutoUpdate": "string",
 "Feature": "string", 
 "ProjectName": "string",
 "Tags": {
 "string": "string"
 }
}
```
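
For reference, the same request can be sketched as boto3 parameters; the parameter names mirror the JSON fields above. The project name and tag values here are placeholders, and the boto3 call is shown in comments because it requires AWS credentials:

```python
# Parameter names mirror the JSON fields of the CreateProject request.
# All values below are placeholders.
create_project_params = {
    "ProjectName": "my_project",
    "Feature": "CUSTOM_LABELS",  # CUSTOM_LABELS or CONTENT_MODERATION
    "AutoUpdate": "ENABLED",     # automatic updates; ENABLED or DISABLED
    "Tags": {"key1": "value1", "key2": "value2"},
}

# With a boto3 Rekognition client, the call would be:
#   response = rek_client.create_project(**create_project_params)
#   print(response["ProjectArn"])

print(sorted(create_project_params))
```

Only `ProjectName` is required; the other fields are optional.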

# Creating training and test datasets



A dataset is a set of images and labels that describe those images. Your project needs a training dataset and a test dataset. Amazon Rekognition Custom Labels uses the training dataset to train your model. After training, Amazon Rekognition Custom Labels uses the test dataset to verify how well the trained model predicts the correct labels.

You can create datasets with the Amazon Rekognition Custom Labels console or with the AWS SDK. Before creating a dataset, we recommend reading [Understanding Amazon Rekognition Custom Labels](understanding-custom-labels.md). For other dataset tasks, see [Managing datasets](managing-dataset.md).

The steps for creating training and test datasets for a project are as follows:

**To create training and test datasets for your project**

1. Determine how you need to label your training and test datasets. For more information, see [Purposing datasets](md-dataset-purpose.md).

1. Collect the images for your training and test datasets. For more information, see [Preparing images](md-prepare-images.md).

1. Create the training and test datasets. For more information, see [Creating training and test datasets with images](md-create-dataset.md). If you're using the AWS SDK, see [Create training and test datasets (SDK)](md-create-dataset.md#cd-create-dataset-sdk).

1. If necessary, add image-level labels or bounding boxes to your dataset images. For more information, see [Labeling images](md-labeling-images.md).

After you create the datasets, you can [train](training-model.md) the model.

**Topics**
+ [Purposing datasets](md-dataset-purpose.md)
+ [Preparing images](md-prepare-images.md)
+ [Creating training and test datasets with images](md-create-dataset.md)
+ [Labeling images](md-labeling-images.md)
+ [Debugging datasets](debugging-datasets.md)

# Purposing datasets


How you label the training and test datasets in your project determines the type of model that you create. With Amazon Rekognition Custom Labels you can create models that do the following.
+ [Find objects, scenes, and concepts](#md-dataset-purpose-classification)
+ [Find object locations](#md-dataset-purpose-localization)
+ [Find brand locations](#md-dataset-purpose-brands)

## Find objects, scenes, and concepts


The model classifies the objects, scenes, and concepts that are associated with an entire image.

You can create two types of classification model, *image classification* and *multi-label classification*. For both types of classification model, the model finds one or more matching labels from the complete set of labels used for training. The training and test datasets both require at least two labels. 

### Image classification


 

The model classifies images as belonging to a set of predefined labels. For example, you might want a model that determines if an image contains a living space. The following image might have a *living\_space* image-level label. 

![\[Cozy living room with fireplace, large windows overlooking backyard patio. Neutral tones, wooden accents.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/living_space1.jpeg)


For this type of model, add a single image-level label to each of the training and test dataset images. For an example project, see [Image classification](getting-started.md#gs-image-classification-example).

### Multi-label classification


The model classifies images into multiple categories, such as the type of flower and whether it has leaves or not. For example, the following image might have *mediterranean\_spurge* and *no\_leaves* image-level labels.

![\[Close-up of a green viburnum flower cluster with tightly packed small florets.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/mediterranean_spurge3.jpg)


For this type of model, assign image-level labels for each category to the training and test dataset images. For an example project, see [Multi-label image classification](getting-started.md#gs-multi-label-image-classification-example).

### Assigning image-level labels


If your images are stored in an Amazon S3 bucket, you can use [folder names](md-create-dataset-s3.md) to automatically add image-level labels. For more information, see [Importing images from an Amazon S3 bucket](md-create-dataset-s3.md). You can also add image-level labels to images after you create a dataset. For more information, see [Assigning image-level labels to an image](md-assign-image-level-labels.md). You can add new labels as you need them. For more information, see [Managing labels](md-labels.md).

## Find object locations


To create a model that predicts the location of objects in your images, you define object location bounding boxes and labels for the images in your training and test datasets. A bounding box is a box that tightly surrounds an object. For example, the following image shows bounding boxes around an Amazon Echo and an Amazon Echo Dot. Each bounding box has an assigned label (*Amazon Echo* or *Amazon Echo Dot*).

![\[Two Amazon smart speakers, one with green bounding box and one blue bounding box, on a wooden surface.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/echos.png)


To find object locations, your datasets need at least one label. During model training, an additional label is automatically created that represents the area outside of the bounding boxes on an image. 

### Assigning bounding boxes


 When you create your dataset, you can include bounding box information for your images. For example, you can import a SageMaker AI Ground Truth format [manifest file](md-create-manifest-file.md) that contains bounding boxes. You can also add bounding boxes after you create a dataset. For more information, see [Labeling objects with bounding boxes](md-localize-objects.md). You can add new labels as you need them. For more information, see [Managing labels](md-labels.md).
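
For orientation only, a single JSON line in a SageMaker AI Ground Truth object-detection manifest has roughly the following shape. The bucket, image size, box coordinates, label name, date, and job name here are placeholder values; see [Creating a manifest file](md-create-manifest-file.md) for the authoritative format:

```python
import json

# One JSON Lines entry describing a labeled bounding box for an image.
# The attribute name ("bounding-box") pairs with its "-metadata" entry.
manifest_line = {
    "source-ref": "s3://my-bucket/images/echo-image-1.png",
    "bounding-box": {
        "image_size": [{"width": 640, "height": 480, "depth": 3}],
        "annotations": [
            {"class_id": 0, "left": 50, "top": 60, "width": 200, "height": 180}
        ],
    },
    "bounding-box-metadata": {
        "objects": [{"confidence": 1}],
        "class-map": {"0": "Amazon Echo"},
        "type": "groundtruth/object-detection",
        "human-annotated": "yes",
        "creation-date": "2023-01-01T00:00:00",
        "job-name": "labeling-job/my-job",
    },
}

# Each dataset image is one such line in the manifest file.
print(json.dumps(manifest_line)["bounding-box-metadata" in manifest_line])
```

Each image in the dataset contributes one such line to the manifest file.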

## Find brand locations


If you want to find the location of brands, such as logos and animated characters, you can use two different types of images for your training dataset images. 
+  Images that are of the logo only. Each image needs a single image-level label that represents the logo name. For example, the image-level label for the following image could be *Lambda*.  
![\[Lambda logo in white on an orange background.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/lambda-logo.jpg)
+ Images that contain the logo in natural locations, such as a football game or an architectural diagram. Each training image needs bounding boxes that surround each instance of the logo. For example, the following image shows an architectural diagram with labeled bounding boxes surrounding the AWS Lambda and Amazon Pinpoint logos.   
![\[Diagram workflow showing AWS Lambda service feeding user activity into Amazon Pinpoint for recommendations.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/brand-detection-lambda.png)

We recommend that you don't mix image-level labels and bounding boxes in your training images. 

The test images must have bounding boxes around instances of the brand that you want to find. You can split the training dataset to create the test dataset only if the training images include labeled bounding boxes. If the training images only have image-level labels, you must create a test dataset that includes images with labeled bounding boxes. If you train a model to find brand locations, follow [Labeling objects with bounding boxes](md-localize-objects.md) and [Assigning image-level labels to an image](md-assign-image-level-labels.md), according to how you label your images. 

The [Brand detection](getting-started.md#gs-brand-detection-example) example project shows how Amazon Rekognition Custom Labels uses labeled bounding boxes to train a model that finds object locations.

## Label requirements for model types


Use the following table to determine how to label your images. 

You can combine image-level labels and bounding box labeled images in a single dataset. In this case, Amazon Rekognition Custom Labels chooses whether to create an image-level model or an object location model. 


| Example | Training images | Test images | 
| --- | --- | --- | 
|  [Image classification](#md-dataset-image-classification)  |  1 Image-level label per image  |  1 Image-level label per image   | 
|  [Multi-label classification](#md-dataset-image-classification-multi-label)  |  Multiple image-level labels per image  |  Multiple image-level labels per image  | 
|  [Find brand locations](#md-dataset-purpose-brands)  |  Image-level labels (you can also use labeled bounding boxes)  |  Labeled bounding boxes  | 
|  [Find object locations](#md-dataset-purpose-localization)  |  Labeled bounding boxes  |  Labeled bounding boxes  | 
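
The table's rules can be sketched as a quick local check on a labeled dataset. This is illustrative only, not how the service itself classifies your dataset; the entry format (dicts with `labels` and `boxes` keys) is an assumption for the example:

```python
def infer_model_type(images):
    """Infer which model type a labeled dataset supports, per the table above.

    Each entry is a dict with "labels" (image-level labels) and
    "boxes" (labeled bounding boxes). Illustrative only.
    """
    # Any bounding box implies an object- or brand-location model.
    if any(img["boxes"] for img in images):
        return "object or brand localization"
    # Multiple image-level labels on one image implies multi-label classification.
    max_labels = max((len(img["labels"]) for img in images), default=0)
    if max_labels > 1:
        return "multi-label classification"
    return "image classification"

dataset = [
    {"labels": ["echo"], "boxes": []},
    {"labels": ["echo-dot"], "boxes": []},
]
print(infer_model_type(dataset))  # image classification
```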

# Preparing images


 The images in your training and test dataset contain the objects, scenes, or concepts that you want your model to find. 

The content of images should be in a variety of backgrounds and lighting that represent the images that you want the trained model to identify.

This section provides information about the images in your training and test dataset.

## Image format


You can train Amazon Rekognition Custom Labels models with images that are in PNG or JPEG format. Similarly, to detect custom labels using `DetectCustomLabels`, you need to supply images that are in PNG or JPEG format.

## Input image recommendations


Amazon Rekognition Custom Labels requires images to train and test your model. To prepare your images, consider the following:
+ Choose a specific domain for the model you want to create. For example, you could choose a model for scenic views and another model for objects such as machine parts. Amazon Rekognition Custom Labels works best if your images are in the chosen domain.
+ Use at least 10 images to train your model.
+ Images must be in PNG or JPEG format.
+ Use images that show the object in a variety of lightings, backgrounds, and resolutions.
+ Training and testing images should be similar to the images that you want to use the model with. 
+ Decide what labels to assign to the images.
+ Ensure that images are sufficiently large in terms of resolution. For more information, see [Guidelines and quotas in Amazon Rekognition Custom Labels](limits.md).
+ Ensure that occlusions don't obscure objects that you want to detect.
+ Use images that have sufficient contrast with the background. 
+ Use images that are bright and sharp. Avoid using images that may be blurry due to subject and camera motion as much as possible.
+ Use an image where the object occupies a large proportion of the image.
+ Images in your test dataset shouldn't be images that are in the training dataset. They should include the objects, scenes, and concepts that the model is trained to analyze.

## Image set size


Amazon Rekognition Custom Labels uses a set of images to train a model. At a minimum, you should use at least 10 images for training. Amazon Rekognition Custom Labels stores training and testing images in datasets. For more information, see [Creating training and test datasets with images](md-create-dataset.md).

# Creating training and test datasets with images

You can start with a project that has a single dataset, or a project that has separate training and test datasets. If you start with a single dataset, Amazon Rekognition Custom Labels splits your dataset during training to create a training dataset (80%) and a test dataset (20%) for your project. Start with a single dataset if you want Amazon Rekognition Custom Labels to decide which images are used for training and testing. For complete control over training, testing, and performance tuning, we recommend that you start your project with separate training and test datasets. 
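
The 80/20 split applied to a single dataset can be approximated locally, for example to preview how many images would land in each set. This is a sketch; the service's own split logic may differ:

```python
import random

def split_dataset(entries, train_fraction=0.8, seed=42):
    """Shuffle and split entries into training (80%) and test (20%) sets,
    approximating the split applied to a single dataset."""
    shuffled = list(entries)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

images = [f"image-{i}.png" for i in range(10)]
train, test = split_dataset(images)
print(len(train), len(test))  # 8 2
```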

You can create training and test datasets for a project by importing images from one of the following locations:
+ [Importing images from an Amazon S3 bucket](md-create-dataset-s3.md)
+ [Importing images from a local computer](md-create-dataset-computer.md)
+ [Using a manifest file to import images](md-create-dataset-ground-truth.md)
+ [Copying content from an existing dataset](md-create-dataset-existing-dataset.md)

If you start your project with separate training and test datasets, you can use different source locations for each dataset.

Depending on where you import your images from, your images might be unlabeled. For example, images imported from a local computer aren't labeled. Images imported from an Amazon SageMaker AI Ground Truth manifest file are labeled. You can use the Amazon Rekognition Custom Labels console to add, change, and assign labels. For more information, see [Labeling images](md-labeling-images.md).

If images upload with errors, or if images or labels are missing, see [Debugging a failed model training](tm-debugging.md).

For more information about datasets, see [Managing datasets](managing-dataset.md).

## Create training and test datasets (SDK)

You can use the AWS SDK to create training and test datasets.

The `CreateDataset` operation allows you to optionally specify tags when creating a new dataset, for the purposes of categorizing and managing your resources. 

### Training dataset


You can use the AWS SDK to create a training dataset in the following ways.
+ Use [CreateDataset](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_CreateDataset) with an Amazon SageMaker format manifest file that you provide. For more information, see [Creating a manifest file](md-create-manifest-file.md). For example code, see [Creating a dataset with a SageMaker AI Ground Truth manifest file (SDK)](md-create-dataset-ground-truth.md#md-create-dataset-ground-truth-sdk).
+ Use `CreateDataset` to copy an existing Amazon Rekognition Custom Labels dataset. For example code, see [Creating a dataset using an existing dataset (SDK)](md-create-dataset-existing-dataset-sdk.md).
+ Create an empty dataset with `CreateDataset` and add dataset entries at a later time with [UpdateDatasetEntries](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_UpdateDatasetEntries). To create an empty dataset, see [Adding a dataset to a project](md-add-dataset.md). To add images to a dataset, see [Adding more images (SDK)](md-add-images.md#md-add-images-sdk). You need to add the dataset entries before you can train a model.

### Test dataset


You can use the AWS SDK to create a test dataset in the following ways:
+ Use [CreateDataset](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_CreateDataset) with an Amazon SageMaker format manifest file that you provide. For more information, see [Creating a manifest file](md-create-manifest-file.md). For example code, see [Creating a dataset with a SageMaker AI Ground Truth manifest file (SDK)](md-create-dataset-ground-truth.md#md-create-dataset-ground-truth-sdk).
+ Use `CreateDataset` to copy an existing Amazon Rekognition Custom Labels dataset. For example code, see [Creating a dataset using an existing dataset (SDK)](md-create-dataset-existing-dataset-sdk.md).
+ Create an empty dataset with `CreateDataset` and add dataset entries at a later time with `UpdateDatasetEntries`. To create an empty dataset, see [Adding a dataset to a project](md-add-dataset.md). To add images to a dataset, see [Adding more images (SDK)](md-add-images.md#md-add-images-sdk). You need to add the dataset entries before you can train a model.
+ Split the training dataset into separate training and test datasets. First create an empty test dataset with `CreateDataset`. Then move 20% of the training dataset entries into the test dataset by calling [DistributeDatasetEntries](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_DistributeDatasetEntries). To create an empty dataset, see [Adding a dataset to a project (SDK)](md-add-dataset.md#md-add-dataset-sdk). To split the training dataset, see [Distributing a training dataset (SDK)](md-distributing-datasets.md).
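
As a sketch of the last option, the `DistributeDatasetEntries` request takes the training and test dataset ARNs. The ARNs below are placeholders, and the boto3 calls are shown in comments because they require AWS credentials:

```python
def distribute_request(training_dataset_arn, test_dataset_arn):
    """Build the DistributeDatasetEntries request used to move roughly 20%
    of the training entries into an existing empty test dataset."""
    return {
        "Datasets": [
            {"Arn": training_dataset_arn},
            {"Arn": test_dataset_arn},
        ]
    }

# With a boto3 Rekognition client, the sequence would be roughly:
#   test_arn = rek_client.create_dataset(
#       ProjectArn=project_arn, DatasetType="TEST")["DatasetArn"]
#   rek_client.distribute_dataset_entries(
#       **distribute_request(train_arn, test_arn))

req = distribute_request(
    "arn:aws:rekognition:us-east-1:111122223333:project/my_project/dataset/train/1",
    "arn:aws:rekognition:us-east-1:111122223333:project/my_project/dataset/test/1",
)
print(len(req["Datasets"]))  # 2
```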

# Importing images from an Amazon S3 bucket


The images are imported from an Amazon S3 bucket. You can use the console bucket, or another Amazon S3 bucket in your AWS account. If you are using the console bucket, the required permissions are already set up. If you are not using the console bucket, see [Accessing external Amazon S3 Buckets](su-console-policy.md#su-external-buckets).

**Note**  
You can't use the AWS SDK to create a dataset directly from images in an Amazon S3 bucket. Instead, create a manifest file that references the source locations of the images. For more information, see [Using a manifest file to import images](md-create-dataset-ground-truth.md).

During dataset creation, you can choose to assign label names to images based on the name of the folder that contains the images. The folder(s) must be a child of the Amazon S3 folder path that you specify in **S3 folder location** during dataset creation. To create a dataset, see [Creating a dataset by importing images from an S3 bucket](#cd-procedure).

For example, assume the following folder structure in an Amazon S3 bucket. If you specify the Amazon S3 folder location as *S3-bucket/alexa-devices*, the images in the folder *echo* are assigned the label *echo*. Similarly, images in the folder *echo-dot* are assigned the label *echo-dot*. The names of deeper child folders aren't used to label images. Instead, the appropriate child folder of the Amazon S3 folder location is used. For example, images in the folder *white-echo-dot* are assigned the label *echo-dot*. Images at the level of the S3 folder location (*alexa-devices*) don't have labels assigned to them.

Folders deeper in the folder structure can be used to label images by specifying a deeper S3 folder location. For example, if you specify *S3-bucket/alexa-devices/echo-dot*, images in the folder *white-echo-dot* are labeled *white-echo-dot*. Images outside the specified S3 folder location, such as those in *echo*, aren't imported.

```
S3-bucket
└── alexa-devices
    ├── echo
    │   ├── echo-image-1.png
    │   ├── echo-image-2.png
    │   ├── .
    │   └── .
    └── echo-dot
        ├── white-echo-dot
        │   ├── white-echo-dot-image-1.png
        │   └── white-echo-dot-image-2.png
        ├── echo-dot-image-1.png
        ├── echo-dot-image-2.png
        ├── .
        └── .
```
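
The labeling rule described above can be sketched as a small function that maps an S3 object key to its label. This is illustrative only; the console performs this mapping for you during dataset creation:

```python
def label_from_key(key, folder_location):
    """Return the label for an S3 object key, following the rule above:
    the label is the immediate child folder of the specified folder location."""
    prefix = folder_location.rstrip("/") + "/"
    if not key.startswith(prefix):
        return None  # outside the specified folder location; not imported
    relative = key[len(prefix):]
    parts = relative.split("/")
    if len(parts) < 2:
        return None  # image sits directly at the folder location; no label
    return parts[0]

print(label_from_key("alexa-devices/echo/echo-image-1.png", "alexa-devices"))
# echo
print(label_from_key(
    "alexa-devices/echo-dot/white-echo-dot/white-echo-dot-image-1.png",
    "alexa-devices"))
# echo-dot
```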

We recommend that you use the Amazon S3 bucket (console bucket) created for you by Amazon Rekognition when you first opened the console in the current AWS region. If the Amazon S3 bucket that you are using is different (external) to the console bucket, the console prompts you to set up appropriate permissions during dataset creation. For more information, see [Step 2: Set up Amazon Rekognition Custom Labels console permissions](su-console-policy.md). 

## Creating a dataset by importing images from an S3 bucket


The following procedure shows you how to create a dataset using images stored in the Console S3 bucket. The images are automatically labeled with the name of the folder in which they are stored. 

After you have imported your images, you can add more images, assign labels, and add bounding boxes from a dataset's gallery page. For more information, see [Labeling images](md-labeling-images.md).<a name="cd-upload-s3-bucket"></a>

**Upload your images to an Amazon Simple Storage Service bucket**

1. Create a folder on your local file system. Use a folder name such as *alexa-devices*.

1. Within the folder you just created, create folders named after each label that you want to use. For example, *echo* and *echo-dot*. The folder structure should be similar to the following.

   ```
   alexa-devices
   ├── echo
   │   ├── echo-image-1.png
   │   ├── echo-image-2.png
   │   ├── .
   │   └── .
   └── echo-dot
       ├── echo-dot-image-1.png
       ├── echo-dot-image-2.png
       ├── .
       └── .
   ```

1. Place the images that correspond to a label into the folder with the same label name.

1. Sign in to the AWS Management Console and open the Amazon S3 console at [https://console.aws.amazon.com/s3/](https://console.aws.amazon.com/s3/).

1. [Add the folder](https://docs.aws.amazon.com/AmazonS3/latest/userguide/upload-objects.html) you created in step 1 to the Amazon S3 bucket (console bucket) created for you by Amazon Rekognition Custom Labels during *First Time Set Up*. For more information, see [Managing an Amazon Rekognition Custom Labels project](managing-project.md).

1. Open the Amazon Rekognition console at [https://console.aws.amazon.com/rekognition/](https://console.aws.amazon.com/rekognition/).

1. Choose **Use Custom Labels**.

1. Choose **Get started**. 

1. In the left navigation pane, choose **Projects**.

1. In the **Projects** page, choose the project to which you want to add a dataset. The details page for your project is displayed.

1. Choose **Create dataset**. The **Create dataset** page is shown.

1. In **Starting configuration**, choose either **Start with a single dataset** or **Start with a training dataset**. To create a higher quality model, we recommend starting with separate training and test datasets.

------
#### [ Single dataset ]

   1. In the **Training dataset details** section, choose **Import images from S3 bucket**.

   1. In the **Training dataset details** section, enter the information for steps 13 - 15 in the **Image source configuration** section. 

------
#### [ Separate training and test datasets ]

   1. In the **Training dataset details** section, choose **Import images from S3 bucket**.

   1. In the **Training dataset details** section, enter the information for steps 13 - 15 in the **Image source configuration** section. 

   1. In the **Test dataset details** section, choose **Import images from S3 bucket**.

   1. In the **Test dataset details** section, enter the information for steps 13 - 15 in the **Image source configuration** section. 

------

1. Choose **Import images from Amazon S3 bucket**.

1. In **S3 URI**, enter the Amazon S3 bucket location and folder path. 

1. Choose **Automatically attach labels to images based on the folder**.

1. Choose **Create Datasets**. The datasets page for your project opens.

1. If you need to add or change labels, see [Labeling images](md-labeling-images.md).

1. Follow the steps in [Training a model (Console)](training-model.md#tm-console) to train your model.

# Importing images from a local computer


You create the dataset by uploading images directly from your computer. You can upload up to 30 images at a time.

The images you upload won't have labels associated with them. For more information, see [Labeling images](md-labeling-images.md). If you have many images to upload, consider using an Amazon S3 bucket. For more information, see [Importing images from an Amazon S3 bucket](md-create-dataset-s3.md).

**Note**  
You can't use the AWS SDK to create a dataset with local images. Instead, create a manifest file and upload the images to an Amazon S3 bucket. For more information, see [Using a manifest file to import images](md-create-dataset-ground-truth.md).

**To create a dataset using images on a local computer (console)**

1. Open the Amazon Rekognition console at [https://console.aws.amazon.com/rekognition/](https://console.aws.amazon.com/rekognition/).

1. Choose **Use Custom Labels**.

1. Choose **Get started**. 

1. In the left navigation pane, choose **Projects**.

1. In the **Projects** page, choose the project to which you want to add a dataset. The details page for your project is displayed.

1. Choose **Create dataset**. The **Create dataset** page is shown.

1. In **Starting configuration**, choose either **Start with a single dataset** or **Start with a training dataset**. To create a higher quality model, we recommend starting with separate training and test datasets.

------
#### [ Single dataset ]

   1. In the **Training dataset details** section, choose **Upload images from your computer**.

   1. Choose **Create Dataset**. 

   1. On the project's dataset page, choose **Add images**. 

   1. Choose the images that you want to upload into the dataset. You can drag the images from your local computer or browse to select them.

   1. Choose **Upload images**.

------
#### [ Separate training and test datasets ]

   1. In the **Training dataset details** section, choose **Upload images from your computer**.

   1. In the **Test dataset details** section, choose **Upload images from your computer**.
**Note**  
Your training and test datasets can have different image sources.

   1. Choose **Create Datasets**. Your project's datasets page appears with a **Training** tab and a **Test** tab for the respective datasets. 

   1. Choose **Actions** and then choose **Add images to training dataset**.

   1. Choose the images that you want to upload to the dataset. You can drag the images from your local computer or browse to select them.

   1. Choose **Upload images**.

   1. Repeat steps 5e - 5g. For step 5e, choose **Actions** and then choose **Add images to test dataset**.

------

1. Follow the steps in [Labeling images](md-labeling-images.md) to label your images.

1. Follow the steps in [Training a model (Console)](training-model.md#tm-console) to train your model.

# Using a manifest file to import images


You can create a dataset using an Amazon SageMaker AI Ground Truth format manifest file. You can use the manifest file from an Amazon SageMaker AI Ground Truth job. If your images and labels aren't in the format of a SageMaker AI Ground Truth manifest file, you can create a SageMaker AI format manifest file and use it to import your labeled images. 

The `CreateDataset` operation allows you to optionally specify tags when creating a new dataset. Tags are key-value pairs that you can use to categorize and manage your resources. 

**Topics**
+ [

## Creating a dataset with a SageMaker AI Ground Truth manifest file (Console)
](#md-create-dataset-ground-truth-console)
+ [

## Creating a dataset with a SageMaker AI Ground Truth manifest file (SDK)
](#md-create-dataset-ground-truth-sdk)
+ [

## Create dataset request
](#create-dataset-ground-truth-request)
+ [

# Labeling images with an Amazon SageMaker AI Ground Truth job
](md-create-dataset-ground-truth-job.md)
+ [

# Creating a manifest file
](md-create-manifest-file.md)
+ [

# Importing image-level labels in manifest files
](md-create-manifest-file-classification.md)
+ [

# Object localization in manifest files
](md-create-manifest-file-object-detection.md)
+ [

# Validation rules for manifest files
](md-create-manifest-file-validation-rules.md)
+ [

# Converting other dataset formats to a manifest file
](md-converting-to-sm-format.md)

## Creating a dataset with a SageMaker AI Ground Truth manifest file (Console)


The following procedure shows you how to create a dataset by using a SageMaker AI Ground Truth format manifest file. 

1. Create a manifest file for your training dataset by doing one of the following:
   + Create a manifest file with a SageMaker AI Ground Truth job by following the instructions at [ Labeling images with an Amazon SageMaker AI Ground Truth job](md-create-dataset-ground-truth-job.md). 
   + Create your own manifest file by following the instructions at [Creating a manifest file](md-create-manifest-file.md). 

   If you want to create a test dataset, repeat step 1 to create the test dataset.

1. Open the Amazon Rekognition console at [https://console.aws.amazon.com/rekognition/](https://console.aws.amazon.com/rekognition/).

1. Choose **Use Custom Labels**.

1. Choose **Get started**. 

1. In the left navigation pane, choose **Projects**.

1. In the **Projects** page, choose the project to which you want to add a dataset. The details page for your project is displayed.

1. Choose **Create dataset**. The **Create dataset** page is shown.

1. In **Starting configuration**, choose either **Start with a single dataset** or **Start with a training dataset**. To create a higher quality model, we recommend starting with separate training and test datasets.

------
#### [ Single dataset ]

   1. In the **Training dataset details** section, choose **Import images labeled by SageMaker Ground Truth**.

   1. In **.manifest file location**, enter the location of the manifest file that you created in step 1.

   1. Choose **Create Dataset**. The datasets page for your project opens.

------
#### [ Separate training and test datasets ]

   1. In the **Training dataset details** section, choose **Import images labeled by SageMaker Ground Truth**.

   1. In **.manifest file location**, enter the location of the training dataset manifest file you created in step 1.

   1. In the **Test dataset details** section, choose **Import images labeled by SageMaker Ground Truth**.
**Note**  
Your training and test datasets can have different image sources.

   1. In **.manifest file location**, enter the location of the test dataset manifest file you created in step 1.

   1. Choose **Create Datasets**. The datasets page for your project opens.

------

1. If you need to add or change labels, see [Labeling images](md-labeling-images.md).

1. Follow the steps in [Training a model (Console)](training-model.md#tm-console) to train your model.

## Creating a dataset with a SageMaker AI Ground Truth manifest file (SDK)


The following procedure shows you how to create training or test datasets from a manifest file by using the [CreateDataset](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_CreateDataset) API.

You can use an existing manifest file, such as the output from a [SageMaker AI Ground Truth job](md-create-dataset-ground-truth-job.md), or create your own [manifest file](md-create-manifest-file.md). 

1. If you haven't already done so, install and configure the AWS CLI and the AWS SDKs. For more information, see [Step 4: Set up the AWS CLI and AWS SDKs](su-awscli-sdk.md).

1. Create a manifest file for your training dataset by doing one of the following:
   + Create a manifest file with a SageMaker AI Ground Truth job by following the instructions at [ Labeling images with an Amazon SageMaker AI Ground Truth job](md-create-dataset-ground-truth-job.md). 
   + Create your own manifest file by following the instructions at [Creating a manifest file](md-create-manifest-file.md). 

   If you want to create a test dataset, repeat step 2 to create the test dataset.

1. Use the following example code to create the training and test dataset.

------
#### [ AWS CLI ]

   Use the following code to create a dataset. Replace the following:
   + `project_arn` — the ARN of the project that you want to add the test dataset to.
   + `type` — the type of dataset that you want to create (TRAIN or TEST)
   + `bucket` — the bucket that contains the manifest file for the dataset.
   + `manifest_file` — the path and file name of the manifest file.

   ```
   aws rekognition create-dataset --project-arn project_arn \
     --dataset-type type \
     --dataset-source '{ "GroundTruthManifest": { "S3Object": { "Bucket": "bucket", "Name": "manifest_file" } } }' \
     --profile custom-labels-access \
     --tags '{"key1": "value1", "key2": "value2"}'
   ```

------
#### [ Python ]

   Use the following values to create a dataset. Supply the following command line parameters:
   + `project_arn` — the ARN of the project that you want to add the test dataset to.
   + `dataset_type` — the type of dataset that you want to create (`train` or `test`).
   + `bucket` — the bucket that contains the manifest file for the dataset.
   + `manifest_file` — the path and file name of the manifest file.

   ```
   #Copyright 2023 Amazon.com, Inc. or its affiliates. All Rights Reserved.
   #PDX-License-Identifier: MIT-0 (For details, see https://github.com/awsdocs/amazon-rekognition-custom-labels-developer-guide/blob/master/LICENSE-SAMPLECODE.)
   
   
   import argparse
   import logging
   import time
   import json
   import boto3
   from botocore.exceptions import ClientError
   
   logger = logging.getLogger(__name__)
   
   def create_dataset(rek_client, project_arn, dataset_type, bucket, manifest_file):
       """
       Creates an Amazon Rekognition Custom Labels dataset.
       :param rek_client: The Amazon Rekognition Custom Labels Boto3 client.
       :param project_arn: The ARN of the project in which you want to create a dataset.
       :param dataset_type: The type of the dataset that you want to create (train or test).
       :param bucket: The S3 bucket that contains the manifest file.
       :param manifest_file: The path and filename of the manifest file.
       """
   
       try:
           #Create the project
           logger.info("Creating %s dataset for project %s",dataset_type, project_arn)
   
           dataset_type = dataset_type.upper()
   
            dataset_source = {
                "GroundTruthManifest": {
                    "S3Object": {"Bucket": bucket, "Name": manifest_file}
                }
            }
   
           response = rek_client.create_dataset(
               ProjectArn=project_arn, DatasetType=dataset_type, DatasetSource=dataset_source
           )
   
           dataset_arn=response['DatasetArn']
   
           logger.info("dataset ARN: %s",dataset_arn)
   
           finished=False
           while finished is False:
   
               dataset=rek_client.describe_dataset(DatasetArn=dataset_arn)
   
               status=dataset['DatasetDescription']['Status']
               
               if status == "CREATE_IN_PROGRESS":
                   logger.info("Creating dataset: %s ",dataset_arn)
                   time.sleep(5)
                   continue
   
               if status == "CREATE_COMPLETE":
                   logger.info("Dataset created: %s", dataset_arn)
                   finished=True
                   continue
   
               if status == "CREATE_FAILED":
                   error_message = f"Dataset creation failed: {status} : {dataset_arn}"
                   logger.exception(error_message)
                    raise Exception(error_message)
                   
               error_message = f"Failed. Unexpected state for dataset creation: {status} : {dataset_arn}"
               logger.exception(error_message)
               raise Exception(error_message)
               
           return dataset_arn
      
       
       except ClientError as err:
           logger.exception("Couldn't create dataset: %s",err.response['Error']['Message'])
           raise
   
   def add_arguments(parser):
       """
       Adds command line arguments to the parser.
       :param parser: The command line parser.
       """
   
       parser.add_argument(
           "project_arn", help="The ARN of the project in which you want to create the dataset."
       )
   
       parser.add_argument(
           "dataset_type", help="The type of the dataset that you want to create (train or test)."
       )
   
       parser.add_argument(
           "bucket", help="The S3 bucket that contains the manifest file."
       )
       
       parser.add_argument(
           "manifest_file", help="The path and filename of the manifest file."
       )
   
   
   def main():
   
       logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
   
       try:
   
           #Get command line arguments.
           parser = argparse.ArgumentParser(usage=argparse.SUPPRESS)
           add_arguments(parser)
           args = parser.parse_args()
   
           print(f"Creating {args.dataset_type} dataset for project {args.project_arn}")
   
           #Create the dataset.
           session = boto3.Session(profile_name='custom-labels-access')
           rekognition_client = session.client("rekognition")
   
           dataset_arn=create_dataset(rekognition_client, 
               args.project_arn,
               args.dataset_type,
               args.bucket,
               args.manifest_file)
   
           print(f"Finished creating dataset: {dataset_arn}")
   
   
       except ClientError as err:
           logger.exception("Problem creating dataset: %s", err)
           print(f"Problem creating dataset: {err}")
   
   
   
   if __name__ == "__main__":
       main()
   ```

------
#### [ Java V2 ]

   Use the following values to create a dataset. Supply the following command line parameters:
   + `project_arn` — the ARN of the project that you want to add the test dataset to.
   + `dataset_type` — the type of dataset that you want to create (`train` or `test`).
   + `bucket` — the bucket that contains the manifest file for the dataset.
   + `manifest_file` — the path and file name of the manifest file.

   ```
   /*
      Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
      SPDX-License-Identifier: Apache-2.0
   */
   
   package com.example.rekognition;
   
   import software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider;
   import software.amazon.awssdk.regions.Region;
   import software.amazon.awssdk.services.rekognition.RekognitionClient;
   import software.amazon.awssdk.services.rekognition.model.CreateDatasetRequest;
   import software.amazon.awssdk.services.rekognition.model.CreateDatasetResponse;
   import software.amazon.awssdk.services.rekognition.model.DatasetDescription;
   import software.amazon.awssdk.services.rekognition.model.DatasetSource;
   import software.amazon.awssdk.services.rekognition.model.DatasetStatus;
   import software.amazon.awssdk.services.rekognition.model.DatasetType;
   import software.amazon.awssdk.services.rekognition.model.DescribeDatasetRequest;
   import software.amazon.awssdk.services.rekognition.model.DescribeDatasetResponse;
   import software.amazon.awssdk.services.rekognition.model.GroundTruthManifest;
   import software.amazon.awssdk.services.rekognition.model.RekognitionException;
   import software.amazon.awssdk.services.rekognition.model.S3Object;
   
   import java.util.logging.Level;
   import java.util.logging.Logger;
   
   public class CreateDatasetManifestFiles {
   
       public static final Logger logger = Logger.getLogger(CreateDatasetManifestFiles.class.getName());
   
       public static String createMyDataset(RekognitionClient rekClient, String projectArn, String datasetType,
               String bucket, String name) throws Exception, RekognitionException {
   
           try {
   
               logger.log(Level.INFO, "Creating {0} dataset for project : {1} from s3://{2}/{3} ",
                       new Object[] { datasetType, projectArn, bucket, name });
   
               DatasetType requestDatasetType = null;
   
               switch (datasetType) {
               case "train":
                   requestDatasetType = DatasetType.TRAIN;
                   break;
               case "test":
                   requestDatasetType = DatasetType.TEST;
                   break;
               default:
                   logger.log(Level.SEVERE, "Could not create dataset. Unrecognized dataset type: {0}", datasetType);
                   throw new Exception("Could not create dataset. Unrecognized dataset type: " + datasetType);
   
               }
   
               GroundTruthManifest groundTruthManifest = GroundTruthManifest.builder()
                       .s3Object(S3Object.builder().bucket(bucket).name(name).build()).build();
   
               DatasetSource datasetSource = DatasetSource.builder().groundTruthManifest(groundTruthManifest).build();
   
               CreateDatasetRequest createDatasetRequest = CreateDatasetRequest.builder().projectArn(projectArn)
                       .datasetType(requestDatasetType).datasetSource(datasetSource).build();
   
               CreateDatasetResponse response = rekClient.createDataset(createDatasetRequest);
   
               boolean created = false;
   
               do {
   
                   DescribeDatasetRequest describeDatasetRequest = DescribeDatasetRequest.builder()
                           .datasetArn(response.datasetArn()).build();
                   DescribeDatasetResponse describeDatasetResponse = rekClient.describeDataset(describeDatasetRequest);
   
                   DatasetDescription datasetDescription = describeDatasetResponse.datasetDescription();
   
                   DatasetStatus status = datasetDescription.status();
   
                   logger.log(Level.INFO, "Creating dataset ARN: {0} ", response.datasetArn());
   
                   switch (status) {
   
                   case CREATE_COMPLETE:
                       logger.log(Level.INFO, "Dataset created");
                       created = true;
                       break;
   
                   case CREATE_IN_PROGRESS:
                       Thread.sleep(5000);
                       break;
   
                   case CREATE_FAILED:
                       String error = "Dataset creation failed: " + datasetDescription.statusAsString() + " "
                               + datasetDescription.statusMessage() + " " + response.datasetArn();
                       logger.log(Level.SEVERE, error);
                       throw new Exception(error);
   
                   default:
                       String unexpectedError = "Unexpected creation state: " + datasetDescription.statusAsString() + " "
                               + datasetDescription.statusMessage() + " " + response.datasetArn();
                       logger.log(Level.SEVERE, unexpectedError);
                       throw new Exception(unexpectedError);
                   }
   
               } while (created == false);
   
               return response.datasetArn();
   
           } catch (RekognitionException e) {
               logger.log(Level.SEVERE, "Could not create dataset: {0}", e.getMessage());
               throw e;
           }
   
       }
   
       public static void main(String[] args) {
   
           String datasetType = null;
           String bucket = null;
           String name = null;
           String projectArn = null;
           String datasetArn = null;
   
            final String USAGE = "\n" + "Usage: " + "<project_arn> <dataset_type> <bucket> <name>\n\n" + "Where:\n"
                    + "   project_arn - the ARN of the project that you want to add the dataset to.\n\n"
                   + "   dataset_type - the type of the dataset that you want to create (train or test).\n\n"
                   + "   bucket - the S3 bucket that contains the manifest file.\n\n"
                   + "   name - the location and name of the manifest file within the bucket.\n\n";
   
           if (args.length != 4) {
               System.out.println(USAGE);
               System.exit(1);
           }
   
           projectArn = args[0];
           datasetType = args[1];
           bucket = args[2];
           name = args[3];
   
           try {
   
               // Get the Rekognition client
               RekognitionClient rekClient = RekognitionClient.builder()
                   .credentialsProvider(ProfileCredentialsProvider.create("custom-labels-access"))
                   .region(Region.US_WEST_2)
                   .build();
   
   
                // Create the dataset
               datasetArn = createMyDataset(rekClient, projectArn, datasetType, bucket, name);
   
               System.out.println(String.format("Created dataset: %s", datasetArn));
   
               rekClient.close();
   
           } catch (RekognitionException rekError) {
               logger.log(Level.SEVERE, "Rekognition client error: {0}", rekError.getMessage());
               System.exit(1);
           } catch (Exception rekError) {
               logger.log(Level.SEVERE, "Error: {0}", rekError.getMessage());
               System.exit(1);
           }
   
       }
   
   }
   ```

------

1. If you need to add or change labels, see [Managing Labels (SDK)](md-labels.md#md-labels-sdk).

1. Follow the steps in [Training a model (SDK)](training-model.md#tm-sdk) to train your model.

## Create dataset request

 The following is the format of the `CreateDataset` operation request: 

```
{
    "DatasetSource": {
        "DatasetArn": "string",
        "GroundTruthManifest": {
            "S3Object": {
                "Bucket": "string",
                "Name": "string",
                "Version": "string"
            }
        }
    },
    "DatasetType": "string",
    "ProjectArn": "string",
    "Tags": {
        "string": "string"
    }
}
```
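For reference, the same request shape can be built programmatically. The following is a minimal sketch (Python) of the keyword arguments for the Boto3 `create_dataset` call; the ARN, bucket, manifest, and tag values below are placeholders:

```python
# Sketch: build the CreateDataset request, including optional tags.
# All ARNs, bucket names, and tag values are placeholders.
def build_create_dataset_request(project_arn, dataset_type, bucket, manifest_file, tags=None):
    """Return the keyword arguments for rekognition.create_dataset."""
    request = {
        "ProjectArn": project_arn,
        "DatasetType": dataset_type,  # "TRAIN" or "TEST"
        "DatasetSource": {
            "GroundTruthManifest": {
                "S3Object": {"Bucket": bucket, "Name": manifest_file}
            }
        },
    }
    if tags:
        request["Tags"] = tags  # key-value pairs, e.g. {"Department": "research"}
    return request

# Usage with Boto3 (requires configured credentials):
# import boto3
# client = boto3.client("rekognition")
# response = client.create_dataset(**build_create_dataset_request(
#     "arn:aws:rekognition:us-east-1:111122223333:project/my-project/1234567890123",
#     "TRAIN", "amzn-s3-demo-bucket", "datasets/train.manifest",
#     tags={"Department": "research"}))
```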

# Labeling images with an Amazon SageMaker AI Ground Truth job


With Amazon SageMaker AI Ground Truth, you can use workers from Amazon Mechanical Turk, a vendor company that you choose, or an internal, private workforce, combined with machine learning, to create a labeled set of images. Amazon Rekognition Custom Labels imports SageMaker AI Ground Truth manifest files from an Amazon S3 bucket that you specify.

Amazon Rekognition Custom Labels supports the following SageMaker AI Ground Truth tasks.
+ [Image Classification](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-image-classification.html)
+ [Bounding Box](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-bounding-box.html)

The files you import are the images and a manifest file. The manifest file contains label and bounding box information for the images you import.

Amazon Rekognition needs permissions to access the Amazon S3 bucket where your images are stored. If you are using the console bucket set up for you by Amazon Rekognition Custom Labels, the required permissions are already set up. If you are not using the console bucket, see [Accessing external Amazon S3 Buckets](su-console-policy.md#su-external-buckets).

## Creating a manifest file with a SageMaker AI Ground Truth job (Console)


The following procedure shows you how to create a dataset by using images labeled by a SageMaker AI Ground Truth job. The job output files are stored in your Amazon Rekognition Custom Labels console bucket.<a name="create-dataset-procedure-ground-truth"></a>

**To create a dataset using images labeled by a SageMaker AI Ground Truth job (console)**

1. Sign in to the AWS Management Console and open the Amazon S3 console at [https://console.aws.amazon.com/s3/](https://console.aws.amazon.com/s3/).

1. In the console bucket, [create a folder](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-folder.html) to hold your training images. 
**Note**  
The console bucket is created when you first open the Amazon Rekognition Custom Labels console in an AWS Region. For more information, see [Managing an Amazon Rekognition Custom Labels project](managing-project.md).

1. [Upload your images](https://docs.aws.amazon.com/AmazonS3/latest/userguide/upload-objects.html) to the folder that you just created.

1. In the console bucket, create a folder to hold the output of the Ground Truth job.

1. Open the SageMaker AI console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/).

1. Create a Ground Truth labeling job. You'll need the Amazon S3 URLs for the folders you created in step 2 and step 4. For more information, see [Use Amazon SageMaker Ground Truth for Data Labeling](https://docs.aws.amazon.com/sagemaker/latest/dg/sms.html). 

1. Note the location of the `output.manifest` file in the folder you created in step 4. It should be in the sub-folder `Ground-Truth-Job-Name/manifests/output`.

1. Follow the instructions at [Creating a dataset with a SageMaker AI Ground Truth manifest file (Console)](md-create-dataset-ground-truth.md#md-create-dataset-ground-truth-console) to create a dataset with the uploaded manifest file. For step 8, in **.manifest file location**, enter the Amazon S3 URL for the location you noted in the previous step. If you are using the AWS SDK, do [Creating a dataset with a SageMaker AI Ground Truth manifest file (SDK)](md-create-dataset-ground-truth.md#md-create-dataset-ground-truth-sdk).

1. Repeat steps 1 - 6 to create a SageMaker AI Ground Truth job for your test dataset.

# Creating a manifest file


You can create a test or training dataset by importing a SageMaker AI Ground Truth format manifest file. If your images are labeled in a format that isn't a SageMaker AI Ground Truth manifest file, use the following information to create a SageMaker AI Ground Truth format manifest file. 

Manifest files are in [JSON lines](http://jsonlines.org) format where each line is a complete JSON object representing the labeling information for an image. Amazon Rekognition Custom Labels supports SageMaker AI Ground Truth manifests with JSON lines in the following formats:
+ [Classification Job Output](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-data-output.html#sms-output-class) – Use to add image-level labels to an image. An image-level label defines the class of scene, concept, or object (if object location information isn't needed) that's on an image. An image can have more than one image-level label. For more information, see [Importing image-level labels in manifest files](md-create-manifest-file-classification.md).
+ [Bounding Box Job Output](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-data-output.html#sms-output-box) – Use to label the class and location of one or more objects on an image. For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md).

Image-level and localization (bounding-box) JSON lines can be chained together in the same manifest file. 

**Note**  
The JSON line examples in this section are formatted for readability. 

When you import a manifest file, Amazon Rekognition Custom Labels applies validation rules for limits, syntax, and semantics. For more information, see [Validation rules for manifest files](md-create-manifest-file-validation-rules.md). 

The images referenced by a manifest file must be located in the same Amazon S3 bucket. The manifest file can be located in a different Amazon S3 bucket than the Amazon S3 bucket that stores the images. You specify the location of an image in the `source-ref` field of a JSON line. 
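The `source-ref` format can be checked with a short helper. The following is a minimal sketch (Python; the function names are our own) that splits a `source-ref` value into its bucket and object path, and verifies that every image referenced by a manifest is in the same bucket:

```python
# Sketch: parse manifest source-ref values ("s3://BUCKET/OBJECT_PATH") and
# confirm that all referenced images share one Amazon S3 bucket.
def parse_source_ref(source_ref):
    """Split an S3 URI into (bucket, object_path)."""
    prefix = "s3://"
    if not source_ref.startswith(prefix):
        raise ValueError(f"Not an Amazon S3 URI: {source_ref}")
    bucket, _, key = source_ref[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"Expected s3://BUCKET/OBJECT_PATH: {source_ref}")
    return bucket, key

def all_in_same_bucket(source_refs):
    """Return True if every source-ref points into a single bucket."""
    buckets = {parse_source_ref(ref)[0] for ref in source_refs}
    return len(buckets) <= 1

# parse_source_ref("s3://bucket/images/sunrise.png") returns
# ("bucket", "images/sunrise.png")
```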

Amazon Rekognition needs permissions to access the Amazon S3 bucket where your images are stored. If you are using the console bucket set up for you by Amazon Rekognition Custom Labels, the required permissions are already set up. If you are not using the console bucket, see [Accessing external Amazon S3 Buckets](su-console-policy.md#su-external-buckets).

**Topics**
+ [

## Creating a manifest file
](#md-create-manifest-file-console)

## Creating a manifest file


The following procedure creates a project with a training and test dataset. The datasets are created from training and test manifest files that you create.

<a name="create-dataset-procedure-manifest-file"></a>

**To create a dataset using a SageMaker AI Ground Truth format manifest file (console)**

1. In the console bucket, [create a folder](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-folder.html) to hold your manifest files. 

1. In the console bucket, create a folder to hold your images.

1. Upload your images to the folder you just created.

1. Create a SageMaker AI Ground Truth format manifest file for your training dataset. For more information, see [Importing image-level labels in manifest files](md-create-manifest-file-classification.md) and [Object localization in manifest files](md-create-manifest-file-object-detection.md).
**Important**  
The `source-ref` field value in each JSON line must map to an image that you uploaded.

1. Create a SageMaker AI Ground Truth format manifest file for your test dataset. 

1. [Upload your manifest files](https://docs.aws.amazon.com/AmazonS3/latest/userguide/upload-objects.html) to the folder that you just created.

1. Note the location of the manifest file.

1. Follow the instructions at [Creating a dataset with a SageMaker AI Ground Truth manifest file (Console)](md-create-dataset-ground-truth.md#md-create-dataset-ground-truth-console) to create a dataset with the uploaded manifest file. For step 8, in **.manifest file location**, enter the Amazon S3 URL for the location you noted in the previous step. If you are using the AWS SDK, do [Creating a dataset with a SageMaker AI Ground Truth manifest file (SDK)](md-create-dataset-ground-truth.md#md-create-dataset-ground-truth-sdk).

# Importing image-level labels in manifest files


To import image-level labels (images labeled with scenes, concepts, or objects that don't require localization information), you add SageMaker AI Ground Truth [Classification Job Output](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-data-output.html#sms-output-class) format JSON lines to a manifest file. A manifest file is made of one or more JSON lines, one for each image that you want to import. 

**Tip**  
To simplify creation of a manifest file, we provide a Python script that creates a manifest file from a CSV file. For more information, see [Creating a manifest file from a CSV file](ex-csv-manifest.md).

**To create a manifest file for image-level labels**

1. Create an empty text file.

1. Add a JSON line for each image that you want to import. Each JSON line should look similar to the following.

   ```
   {"source-ref":"s3://custom-labels-console-us-east-1-nnnnnnnnnn/gt-job/manifest/IMG_1133.png","TestCLConsoleBucket":0,"TestCLConsoleBucket-metadata":{"confidence":0.95,"job-name":"labeling-job/testclconsolebucket","class-name":"Echo Dot","human-annotated":"yes","creation-date":"2020-04-15T20:17:23.433061","type":"groundtruth/image-classification"}}
   ```

1. Save the file. You can use the extension `.manifest`, but it is not required. 

1. Create a dataset using the manifest file that you created. For more information, see [To create a dataset using a SageMaker AI Ground Truth format manifest file (console)](md-create-manifest-file.md#create-dataset-procedure-manifest-file). 
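The steps above can be sketched in Python. The following is a minimal sketch, not part of the product; the bucket path, label attribute name, and output filename are placeholder examples that you would replace with your own values.

```python
import json
from datetime import datetime, timezone

def image_level_json_line(s3_uri, label_attribute, class_name):
    """Build one image-level JSON line (a sketch; field values are examples)."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")
    return {
        "source-ref": s3_uri,
        label_attribute: 0,  # label attribute identifier; any integer is accepted
        f"{label_attribute}-metadata": {
            "confidence": 1,  # required field, but not used by Custom Labels
            "job-name": f"labeling-job/{label_attribute}",
            "class-name": class_name,
            "human-annotated": "yes",
            "creation-date": now,
            "type": "groundtruth/image-classification",
        },
    }

# Write one JSON line per image to a .manifest file
images = [("s3://my-bucket/images/sunrise1.png", "Sunrise")]
with open("train.manifest", "w") as f:
    for uri, label in images:
        f.write(json.dumps(image_level_json_line(uri, "my-classification-job", label)) + "\n")
```

The label attribute name (`my-classification-job` here) is a free choice, as described in the field reference that follows.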

 

## Image-Level JSON Lines


In this section, we show you how to create a JSON line for a single image. Consider the following image, whose scene might be labeled *Sunrise*.

![\[Sunset over a lake with a dock and small boats, surrounded by mountains.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/sunrise.png)


The JSON line for the preceding image, with the scene *Sunrise*, might be the following. 

```
{
    "source-ref": "s3://bucket/images/sunrise.png",
    "testdataset-classification_Sunrise": 1,
    "testdataset-classification_Sunrise-metadata": {
        "confidence": 1,
        "job-name": "labeling-job/testdataset-classification_Sunrise",
        "class-name": "Sunrise",
        "human-annotated": "yes",
        "creation-date": "2020-03-06T17:46:39.176",
        "type": "groundtruth/image-classification"
    }
}
```

Note the following information.

### source-ref


(Required) The Amazon S3 location of the image. The format is `"s3://BUCKET/OBJECT_PATH"`. Images in an imported dataset must be stored in the same Amazon S3 bucket. 

### *testdataset-classification_Sunrise*


(Required) The label attribute. You choose the field name. The field value (1 in the preceding example) is a label attribute identifier. It is not used by Amazon Rekognition Custom Labels and can be any integer value. There must be corresponding metadata identified by the field name with *-metadata* appended. For example, `"testdataset-classification_Sunrise-metadata"`. 

### *testdataset-classification_Sunrise*-metadata


(Required) Metadata about the label attribute. The field name must be the same as the label attribute with *-metadata* appended. 

*confidence*  
(Required) Currently not used by Amazon Rekognition Custom Labels but a value between 0 and 1 must be supplied. 

*job-name*  
(Optional) A name that you choose for the job that processes the image. 

*class-name*  
(Required) A class name that you choose for the scene or concept that applies to the image. For example, `"Sunrise"`. 

*human-annotated*  
(Required) Specify `"yes"`, if the annotation was completed by a human. Otherwise `"no"`. 

*creation-date*   
(Required) The Coordinated Universal Time (UTC) date and time that the label was created. 

*type*  
(Required) The type of processing that should be applied to the image. For image-level labels, the value is `"groundtruth/image-classification"`. 
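Putting the required fields together, a small local check such as the following can catch missing metadata before import. This is an illustrative sketch, not an official validator; the function name and rules are chosen here based on the field descriptions above.

```python
import json

# Metadata fields the preceding reference marks as required
REQUIRED_METADATA = {"confidence", "class-name", "human-annotated", "creation-date", "type"}

def check_image_level_line(line):
    """Return a list of problems found in one image-level JSON line (sketch)."""
    problems = []
    obj = json.loads(line)
    if "source-ref" not in obj:
        problems.append("missing source-ref")
    # A label attribute is any non-metadata field other than source-ref
    attrs = [k for k in obj if k != "source-ref" and not k.endswith("-metadata")]
    for attr in attrs:
        meta = obj.get(f"{attr}-metadata", {})
        missing = REQUIRED_METADATA - set(meta)
        if missing:
            problems.append(f"{attr}: missing {sorted(missing)}")
        elif meta.get("type") != "groundtruth/image-classification":
            problems.append(f"{attr}: wrong type value")
    return problems
```

Running the check over every line of a manifest before import avoids a failed dataset creation.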

### Adding multiple image-level labels to an image


You can add multiple labels to an image. For example, the following JSON adds two labels, *football* and *ball*, to a single image. 

```
{
    "source-ref": "S3 bucket location", 
    "sport0":0, # FIRST label
    "sport0-metadata": { 
        "class-name": "football", 
        "confidence": 0.8, 
        "type":"groundtruth/image-classification", 
        "job-name": "identify-sport", 
        "human-annotated": "yes", 
        "creation-date": "2018-10-18T22:18:13.527256" 
    },
    "sport1":1, # SECOND label
    "sport1-metadata": { 
        "class-name": "ball", 
        "confidence": 0.8, 
        "type":"groundtruth/image-classification", 
        "job-name": "identify-sport", 
        "human-annotated": "yes", 
        "creation-date": "2018-10-18T22:18:13.527256" 
    }
}  # end of annotations for 1 image
```

# Object localization in manifest files


You can import images labeled with object localization information by adding SageMaker AI Ground Truth [Bounding Box Job Output](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-data-output.html#sms-output-box) format JSON lines to a manifest file. 

Localization information represents the location of an object on an image. The location is represented by a bounding box that surrounds the object. The bounding box structure contains the upper-left coordinates of the bounding box and the bounding box's width and height. A bounding box format JSON line includes bounding boxes for the locations of one or more objects on an image and the class of each object on the image. 

A manifest file is made up of one or more JSON lines; each line contains the information for a single image.

**To create a manifest file for object localization**

1. Create an empty text file.

1. Add a JSON line for each image that you want to import. Each JSON line should look similar to the following.

   ```
   {"source-ref": "s3://bucket/images/IMG_1186.png", "bounding-box": {"image_size": [{"width": 640, "height": 480, "depth": 3}], "annotations": [{ "class_id": 1,	"top": 251,	"left": 399, "width": 155, "height": 101}, {"class_id": 0, "top": 65, "left": 86, "width": 220,	"height": 334}]}, "bounding-box-metadata": {"objects": [{ "confidence": 1}, {"confidence": 1}],	"class-map": {"0": "Echo",	"1": "Echo Dot"}, "type": "groundtruth/object-detection", "human-annotated": "yes",	"creation-date": "2013-11-18T02:53:27", "job-name": "my job"}}
   ```

1. Save the file. You can use the extension `.manifest`, but it is not required. 

1. Create a dataset using the file that you just created. For more information, see [To create a dataset using a SageMaker AI Ground Truth format manifest file (console)](md-create-manifest-file.md#create-dataset-procedure-manifest-file). 
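The steps above can be sketched in Python. This is a minimal sketch, not part of the product; the label attribute name `bounding-box` is a free choice, and the bucket path and values are placeholder examples.

```python
import json

def bounding_box_json_line(s3_uri, img_width, img_height, boxes, class_map, job_name="my job"):
    """Build one object-localization JSON line (a sketch).

    boxes is a list of (class_id, top, left, width, height) tuples in pixels.
    """
    return {
        "source-ref": s3_uri,
        "bounding-box": {
            "image_size": [{"width": img_width, "height": img_height, "depth": 3}],
            "annotations": [
                {"class_id": c, "top": t, "left": l, "width": w, "height": h}
                for (c, t, l, w, h) in boxes
            ],
        },
        "bounding-box-metadata": {
            "objects": [{"confidence": 1} for _ in boxes],  # required, not used
            "class-map": class_map,
            "type": "groundtruth/object-detection",
            "human-annotated": "yes",
            "creation-date": "2013-11-18T02:53:27",
            "job-name": job_name,
        },
    }

line = bounding_box_json_line(
    "s3://bucket/images/IMG_1186.png", 640, 480,
    [(1, 251, 399, 155, 101), (0, 65, 86, 220, 334)],
    {"0": "Echo", "1": "Echo Dot"},
)
print(json.dumps(line))
```

Each call produces one manifest line; write one line per image to the manifest file.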



## Object bounding box JSON lines


In this section, we show you how to create a JSON line for a single image. The following image shows bounding boxes around Amazon Echo and Amazon Echo Dot devices.

![\[Two Amazon smart speakers, one with green bounding box and one blue bounding box, on a wooden surface.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/echos.png)


The following is the bounding box JSON line for the preceding image. 

```
{
	"source-ref": "s3://custom-labels-bucket/images/IMG_1186.png",
	"bounding-box": {
		"image_size": [{
			"width": 640,
			"height": 480,
			"depth": 3
		}],
		"annotations": [{
			"class_id": 1,
			"top": 251,
			"left": 399,
			"width": 155,
			"height": 101
		}, {
			"class_id": 0,
			"top": 65,
			"left": 86,
			"width": 220,
			"height": 334
		}]
	},
	"bounding-box-metadata": {
		"objects": [{
			"confidence": 1
		}, {
			"confidence": 1
		}],
		"class-map": {
			"0": "Echo",
			"1": "Echo Dot"
		},
		"type": "groundtruth/object-detection",
		"human-annotated": "yes",
		"creation-date": "2013-11-18T02:53:27",
		"job-name": "my job"
	}
}
```

Note the following information.

### source-ref


(Required) The Amazon S3 location of the image. The format is `"s3://BUCKET/OBJECT_PATH"`. Images in an imported dataset must be stored in the same Amazon S3 bucket. 

### *bounding-box*


(Required) The label attribute. You choose the field name. Contains the image size and the bounding boxes for each object detected in the image. There must be corresponding metadata identified by the field name with *-metadata* appended. For example, `"bounding-box-metadata"`. 

*image_size*  
(Required) A single-element array containing the size of the image in pixels.   
+ *height* – (Required) The height of the image in pixels. 
+ *width* – (Required) The width of the image in pixels. 
+ *depth* – (Required) The number of channels in the image. For RGB images, the value is 3. Not currently used by Amazon Rekognition Custom Labels, but a value is required. 

*annotations*  
(Required) An array of bounding box information for each object detected in the image.  
+ *class_id* – (Required) Maps to the label in *class-map*. In the preceding example, the object with the *class_id* of `1` is the Echo Dot in the image. 
+ *top* – (Required) The distance from the top of the image to the top of the bounding box, in pixels. 
+ *left* – (Required) The distance from the left of the image to the left of the bounding box, in pixels. 
+ *width* – (Required) The width of the bounding box, in pixels. 
+ *height* – (Required) The height of the bounding box, in pixels. 
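If your source labels use corner coordinates (xmin, ymin, xmax, ymax), converting them to this top/left/width/height form is straightforward. The following is a sketch; the function name is chosen here for illustration.

```python
def corners_to_annotation(class_id, xmin, ymin, xmax, ymax):
    """Convert corner coordinates (pixels) to the manifest annotation shape."""
    return {
        "class_id": class_id,
        "top": ymin,                 # distance from the top of the image
        "left": xmin,                # distance from the left of the image
        "width": xmax - xmin,        # bounding box width
        "height": ymax - ymin,       # bounding box height
    }
```

For example, a box with corners (399, 251) and (554, 352) becomes the Echo Dot annotation in the preceding JSON line.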

### *bounding-box*-metadata


(Required) Metadata about the label attribute. The field name must be the same as the label attribute with *-metadata* appended. It contains information about each object detected in the image.

*Objects*  
(Required) An array of objects that are in the image. Maps to the *annotations* array by index. The confidence attribute isn't used by Amazon Rekognition Custom Labels. 

*class-map*  
(Required) A map of the classes that apply to objects detected in the image. 

*type*  
(Required) The type of classification job. `"groundtruth/object-detection"` identifies the job as object detection. 

*creation-date*   
(Required) The Coordinated Universal Time (UTC) date and time that the label was created. 

*human-annotated*  
(Required) Specify `"yes"`, if the annotation was completed by a human. Otherwise `"no"`. 

*job-name*  
(Optional) The name of the job that processes the image. 

# Validation rules for manifest files


 When you import a manifest file, Amazon Rekognition Custom Labels applies validation rules for limits, syntax, and semantics. The SageMaker AI Ground Truth schema enforces syntax validation. For more information, see [Output](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-data-output.html). The following are the validation rules for limits and semantics.

**Note**  
The 20% invalidity rules apply cumulatively across all validation rules. If the import exceeds the 20% limit due to any combination, such as 15% invalid JSON and 15% invalid images, the import fails. 
Each dataset object is a line in the manifest. Blank/invalid lines are also counted as dataset objects.
Overlaps are (common labels between test and train)/(train labels).
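The overlap calculation in the note above can be checked locally before import. The following is a sketch assuming you extract the label sets from each manifest yourself; the function name is chosen here.

```python
def label_overlap(train_labels, test_labels):
    """Overlap = (labels common to test and train) / (train labels)."""
    train, test = set(train_labels), set(test_labels)
    if not train:
        return 0.0
    return len(train & test) / len(train)

# Per the semantics rules below, less than 50% overlap fails the import.
assert label_overlap({"Echo", "Echo Dot"}, {"Echo"}) == 0.5
```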

**Topics**
+ [

## Limits
](#md-validation-rules-limits)
+ [

## Semantics
](#md-validation-rules-semantics)

## Limits



| Validation | Limit | Error raised | 
| --- | --- | --- | 
|  Manifest file size  |  Maximum 1 GB  |  Error  | 
|  Maximum line count for a manifest file  |  Maximum of 250,000 dataset objects as lines in a manifest.   |  Error  | 
|  Lower boundary on total number of valid dataset objects per label   |  >= 1  |  Error  | 
|  Lower boundary on labels  |  >=2  |  Error  | 
|  Upper bound on labels  |  <= 250  |  Error  | 
|  Minimum bounding boxes per image  |  0  |  None  | 
|  Maximum bounding boxes per image  |  50  |  None  | 
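The limits above can also be checked locally before import. The following sketch counts dataset objects and per-label examples in an image-level manifest; it is illustrative only, and the thresholds are copied from the table.

```python
import json
from collections import Counter

def check_limits(manifest_path):
    """Count dataset objects and labels against the limits table (sketch)."""
    label_counts = Counter()
    line_count = 0
    with open(manifest_path) as f:
        for line in f:
            line_count += 1  # blank/invalid lines still count as dataset objects
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                continue
            for key, meta in obj.items():
                if key.endswith("-metadata") and isinstance(meta, dict) and "class-name" in meta:
                    label_counts[meta["class-name"]] += 1
    problems = []
    if line_count > 250000:
        problems.append("more than 250,000 dataset objects")
    if len(label_counts) < 2:
        problems.append("fewer than 2 labels")
    if len(label_counts) > 250:
        problems.append("more than 250 labels")
    return line_count, label_counts, problems
```

The 1 GB file-size limit can be checked with `os.path.getsize` before parsing.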

## Semantics





| Validation | Limit | Error raised | 
| --- | --- | --- | 
|  Empty manifest  |    |  Error  | 
|  Missing/inaccessible source-ref object  |  Fewer than 20% of objects  |  Warning  | 
|  Missing/inaccessible source-ref object  |  More than 20% of objects  |  Error  | 
|  Test labels not present in training dataset   |  At least 50% overlap in the labels  |  Error  | 
|  Mix of label vs. object examples for same label in a dataset. Classification and detection for the same class in a dataset object.   |    |  No error or warning  | 
|  Overlapping assets between test and train   |  There should not be an overlap between test and training datasets.   |    | 
|  Images in a dataset must be from same bucket   |  Error if the objects are in a different bucket  |  Error  | 
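The same-bucket rule in the last row can be verified locally. The following is a sketch; it only parses the `source-ref` URIs and doesn't check that the objects are accessible.

```python
import json

def same_bucket(manifest_lines):
    """Return True if every source-ref points into a single S3 bucket (sketch)."""
    buckets = set()
    for line in manifest_lines:
        uri = json.loads(line)["source-ref"]
        if not uri.startswith("s3://"):
            return False
        # bucket is the first path segment after the s3:// scheme
        buckets.add(uri[len("s3://"):].split("/", 1)[0])
    return len(buckets) <= 1
```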

# Converting other dataset formats to a manifest file
Converting other formats to a manifest file

You can use the following information to create Amazon SageMaker AI format manifest files from a variety of source dataset formats. After creating the manifest file, use it to create a dataset. For more information, see [Using a manifest file to import images](md-create-dataset-ground-truth.md).

**Topics**
+ [

# Transforming a COCO dataset into a manifest file format
](md-transform-coco.md)
+ [

# Transforming multi-label SageMaker AI Ground Truth manifest files
](md-gt-cl-transform.md)
+ [

# Creating a manifest file from a CSV file
](ex-csv-manifest.md)

# Transforming a COCO dataset into a manifest file format


[COCO](http://cocodataset.org/#home) is a format for specifying large-scale object detection, segmentation, and captioning datasets. This Python [example](md-coco-transform-example.md) shows you how to transform a COCO object detection format dataset into an Amazon Rekognition Custom Labels [bounding box format manifest file](md-create-manifest-file-object-detection.md). This section also includes information that you can use to write your own code.

A COCO format JSON file consists of five sections providing information for *an entire dataset*. For more information, see [The COCO dataset format](md-coco-overview.md). 
+ `info` – general information about the dataset. 
+ `licenses` – license information for the images in the dataset.
+ [`images`](md-coco-overview.md#md-coco-images) – a list of images in the dataset.
+ [`annotations`](md-coco-overview.md#md-coco-annotations) – a list of annotations (including bounding boxes) that are present in all images in the dataset.
+ [`categories`](md-coco-overview.md#md-coco-categories) – a list of label categories.

You need information from the `images`, `annotations`, and `categories` lists to create an Amazon Rekognition Custom Labels manifest file.

An Amazon Rekognition Custom Labels manifest file is in JSON lines format where each line has the bounding box and label information for one or more objects *on an image*. For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md).

## Mapping COCO Objects to a Custom Labels JSON Line


To transform a COCO format dataset, you map the COCO dataset to an Amazon Rekognition Custom Labels manifest file for object localization. For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md). To build a JSON line for each image, you map the IDs in the COCO dataset `image`, `annotation`, and `category` objects. 

The following is an example COCO manifest file. For more information, see [The COCO dataset format](md-coco-overview.md).

```
{
    "info": {
        "description": "COCO 2017 Dataset","url": "http://cocodataset.org","version": "1.0","year": 2017,"contributor": "COCO Consortium","date_created": "2017/09/01"
    },
    "licenses": [
        {"url": "http://creativecommons.org/licenses/by/2.0/","id": 4,"name": "Attribution License"}
    ],
    "images": [
        {"id": 242287, "license": 4, "coco_url": "http://images.cocodataset.org/val2017/xxxxxxxxxxxx.jpg", "flickr_url": "http://farm3.staticflickr.com/2626/xxxxxxxxxxxx.jpg", "width": 426, "height": 640, "file_name": "xxxxxxxxx.jpg", "date_captured": "2013-11-15 02:41:42"},
        {"id": 245915, "license": 4, "coco_url": "http://images.cocodataset.org/val2017/nnnnnnnnnnnn.jpg", "flickr_url": "http://farm1.staticflickr.com/88/xxxxxxxxxxxx.jpg", "width": 640, "height": 480, "file_name": "nnnnnnnnnn.jpg", "date_captured": "2013-11-18 02:53:27"}
    ],
    "annotations": [
        {"id": 125686, "category_id": 0, "iscrowd": 0, "segmentation": [[164.81, 417.51,......167.55, 410.64]], "image_id": 242287, "area": 42061.80340000001, "bbox": [19.23, 383.18, 314.5, 244.46]},
        {"id": 1409619, "category_id": 0, "iscrowd": 0, "segmentation": [[376.81, 238.8,........382.74, 241.17]], "image_id": 245915, "area": 3556.2197000000015, "bbox": [399, 251, 155, 101]},
        {"id": 1410165, "category_id": 1, "iscrowd": 0, "segmentation": [[486.34, 239.01,..........495.95, 244.39]], "image_id": 245915, "area": 1775.8932499999994, "bbox": [86, 65, 220, 334]}
    ],
    "categories": [
        {"supercategory": "speaker","id": 0,"name": "echo"},
        {"supercategory": "speaker","id": 1,"name": "echo dot"}
    ]
}
```

The following diagram shows how the COCO dataset lists for a *dataset* map to Amazon Rekognition Custom Labels JSON lines for an *image*. Every JSON line for an image has a source-ref field, a label attribute field, and a label attribute metadata field. Matching colors indicate information for a single image. Note that in the manifest, an individual image can have multiple annotations and class-map entries.

![\[Diagram showing the structure of Coco Manifest, with images, annotations, and categories contained within it.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/coco-transform.png)


**To get the COCO objects for a single JSON line**

1. For each image in the images list, get the annotations from the annotations list where the value of each annotation's `image_id` field matches the image `id` field.

1. For each annotation matched in step 1, read through the `categories` list and get each `category` where the value of the `category` field `id` matches the `annotation` object `category_id` field.

1. Create a JSON line for the image using the matched `image`, `annotation`, and `category` objects. To map the fields, see [Mapping COCO object fields to a Custom Labels JSON line object fields](#md-mapping-fields-coco). 

1. Repeat steps 1–3 until you have created JSON lines for each `image` object in the `images` list.
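The matching in steps 1 and 2 is a join on IDs. In Python it is usually done with dictionary lookups rather than nested scans; the following is a sketch of that indexing step, with names chosen here for illustration.

```python
from collections import defaultdict

def index_coco(images, annotations, categories):
    """Group COCO annotations by image and build a category-name lookup (sketch)."""
    categories_by_id = {c["id"]: c["name"] for c in categories}
    annotations_by_image = defaultdict(list)
    for a in annotations:
        annotations_by_image[a["image_id"]].append(a)
    # One (image, its annotations, category lookup) triple per JSON line to build
    return [(img, annotations_by_image[img["id"]], categories_by_id) for img in images]
```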

For example code, see [Transforming a COCO dataset](md-coco-transform-example.md).

## Mapping COCO object fields to Custom Labels JSON line object fields


After you identify the COCO objects for an Amazon Rekognition Custom Labels JSON line, you need to map the COCO object fields to the respective Amazon Rekognition Custom Labels JSON line object fields. The following example Amazon Rekognition Custom Labels JSON line maps one image (`id` = `245915`) from the preceding COCO JSON example. Note the following information.
+ `source-ref` is the location of the image in an Amazon S3 bucket. If your COCO images aren't stored in an Amazon S3 bucket, you need to move them to an Amazon S3 bucket.
+ The `annotations` list contains an `annotation` object for each object on the image. An `annotation` object includes bounding box information (`top`, `left`,`width`, `height`) and a label identifier (`class_id`).
+ The label identifier (`class_id`) maps to the `class-map` list in the metadata. It lists the labels used on the image.

```
{
	"source-ref": "s3://custom-labels-bucket/images/000000245915.jpg",
	"bounding-box": {
		"image_size": [{
			"width": 640,
			"height": 480,
			"depth": 3
		}],
		"annotations": [{
			"class_id": 0,
			"top": 251,
			"left": 399,
			"width": 155,
			"height": 101
		}, {
			"class_id": 1,
			"top": 65,
			"left": 86,
			"width": 220,
			"height": 334
		}]
	},
	"bounding-box-metadata": {
		"objects": [{
			"confidence": 1
		}, {
			"confidence": 1
		}],
		"class-map": {
			"0": "Echo",
			"1": "Echo Dot"
		},
		"type": "groundtruth/object-detection",
		"human-annotated": "yes",
		"creation-date": "2018-10-18T22:18:13.527256",
		"job-name": "my job"
	}
}
```

Use the following information to map Amazon Rekognition Custom Labels manifest file fields to COCO dataset JSON fields. 

### source-ref


The S3 format URL for the location of the image. The image must be stored in an S3 bucket. For more information, see [source-ref](md-create-manifest-file-object-detection.md#cd-manifest-source-ref). If the `coco_url` COCO field points to an S3 bucket location, you can use the value of `coco_url` for the value of `source-ref`. Alternatively, you can map `source-ref` to the `file_name` (COCO) field and in your transform code, add the required S3 path to where the image is stored. 

### *bounding-box*


A label attribute name of your choosing. For more information, see [*bounding-box*](md-create-manifest-file-object-detection.md#md-manifest-source-bounding-box).

#### image_size


The size of the image in pixels. Maps to an `image` object in the [images](md-coco-overview.md#md-coco-images) list.
+ `height` -> `image.height`
+ `width` -> `image.width`
+ `depth` -> Not used by Amazon Rekognition Custom Labels, but a value must be supplied.

#### annotations


A list of `annotation` objects. There’s one `annotation` for each object on the image.

#### annotation


Contains bounding box information for one instance of an object on the image. 
+ `class_id` -> numerical ID that maps to the Custom Labels `class-map` list.
+ `top` -> `bbox[1]`
+ `left` -> `bbox[0]`
+ `width` -> `bbox[2]`
+ `height` -> `bbox[3]`
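In code, that mapping is direct, because COCO's `bbox` is `[x, y, width, height]` measured from the top-left corner. The following is a sketch; the function name is chosen here.

```python
def coco_bbox_to_annotation(annotation):
    """Map one COCO annotation's bbox [x, y, width, height] to the manifest shape."""
    x, y, w, h = annotation["bbox"]
    return {
        "class_id": annotation["category_id"],
        "top": y,       # bbox[1]
        "left": x,      # bbox[0]
        "width": w,     # bbox[2]
        "height": h,    # bbox[3]
    }
```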

### *bounding-box*-metadata


Metadata for the label attribute. Includes the labels and label identifiers. For more information, see [*bounding-box*-metadata](md-create-manifest-file-object-detection.md#md-manifest-source-bounding-box-metadata).

#### Objects


An array of objects in the image. Maps to the `annotations` list by index.

##### Object

+ `confidence`->Not used by Amazon Rekognition Custom Labels, but a value (1) is required.

#### class-map


A map of the labels (classes) that apply to objects detected in the image. Maps to category objects in the [categories](md-coco-overview.md#md-coco-categories) list.
+ `id` -> `category.id`
+ `id value` -> `category.name`

#### type


Must be `groundtruth/object-detection`.

#### human-annotated


Specify `yes` or `no`. For more information, see [*bounding-box*-metadata](md-create-manifest-file-object-detection.md#md-manifest-source-bounding-box-metadata).

#### creation-date -> [image](md-coco-overview.md#md-coco-images).date_captured


The creation date and time of the image. Maps to the [image](md-coco-overview.md#md-coco-images).date_captured field of an image in the COCO images list. Amazon Rekognition Custom Labels expects the format of `creation-date` to be *Y-M-DTH:M:S*.
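The COCO `date_captured` value uses a space separator, so it needs reformatting. A sketch of that conversion, with a name chosen here:

```python
from datetime import datetime

def coco_date_to_creation_date(date_captured):
    """Convert COCO "YYYY-MM-DD HH:MM:SS" to the expected "YYYY-MM-DDTHH:MM:SS"."""
    dt = datetime.strptime(date_captured, "%Y-%m-%d %H:%M:%S")
    return dt.strftime("%Y-%m-%dT%H:%M:%S")
```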

#### job-name


A job name of your choosing. 

# The COCO dataset format


A COCO dataset consists of five sections of information that provide information for the entire dataset. The format for a COCO object detection dataset is documented at [COCO Data Format](http://cocodataset.org/#format-data). 
+ info – general information about the dataset. 
+ licenses – license information for the images in the dataset.
+ [images](#md-coco-images) – a list of images in the dataset.
+ [annotations](#md-coco-annotations) – a list of annotations (including bounding boxes) that are present in all images in the dataset.
+ [categories](#md-coco-categories) – a list of label categories.

To create a Custom Labels manifest, you use the `images`, `annotations`, and `categories` lists from the COCO manifest file. The other sections (`info`, `licenses`) aren’t required. The following is an example COCO manifest file.

```
{
    "info": {
        "description": "COCO 2017 Dataset","url": "http://cocodataset.org","version": "1.0","year": 2017,"contributor": "COCO Consortium","date_created": "2017/09/01"
    },
    "licenses": [
        {"url": "http://creativecommons.org/licenses/by/2.0/","id": 4,"name": "Attribution License"}
    ],
    "images": [
        {"id": 242287, "license": 4, "coco_url": "http://images.cocodataset.org/val2017/xxxxxxxxxxxx.jpg", "flickr_url": "http://farm3.staticflickr.com/2626/xxxxxxxxxxxx.jpg", "width": 426, "height": 640, "file_name": "xxxxxxxxx.jpg", "date_captured": "2013-11-15 02:41:42"},
        {"id": 245915, "license": 4, "coco_url": "http://images.cocodataset.org/val2017/nnnnnnnnnnnn.jpg", "flickr_url": "http://farm1.staticflickr.com/88/xxxxxxxxxxxx.jpg", "width": 640, "height": 480, "file_name": "nnnnnnnnnn.jpg", "date_captured": "2013-11-18 02:53:27"}
    ],
    "annotations": [
        {"id": 125686, "category_id": 0, "iscrowd": 0, "segmentation": [[164.81, 417.51,......167.55, 410.64]], "image_id": 242287, "area": 42061.80340000001, "bbox": [19.23, 383.18, 314.5, 244.46]},
        {"id": 1409619, "category_id": 0, "iscrowd": 0, "segmentation": [[376.81, 238.8,........382.74, 241.17]], "image_id": 245915, "area": 3556.2197000000015, "bbox": [399, 251, 155, 101]},
        {"id": 1410165, "category_id": 1, "iscrowd": 0, "segmentation": [[486.34, 239.01,..........495.95, 244.39]], "image_id": 245915, "area": 1775.8932499999994, "bbox": [86, 65, 220, 334]}
    ],
    "categories": [
        {"supercategory": "speaker","id": 0,"name": "echo"},
        {"supercategory": "speaker","id": 1,"name": "echo dot"}
    ]
}
```

## images list


The images referenced by a COCO dataset are listed in the images array. Each image object contains information about the image such as the image file name. In the following example image object, note the following information and which fields are required to create an Amazon Rekognition Custom Labels manifest file.
+ `id` – (Required) A unique identifier for the image. The `id` field maps to the `image_id` field in the annotations array (where bounding box information is stored).
+ `license` – (Not Required) Maps to the license array. 
+ `coco_url` – (Optional) The location of the image.
+ `flickr_url` – (Not required) The location of the image on Flickr.
+ `width` – (Required) The width of the image.
+ `height` – (Required) The height of the image.
+ `file_name` – (Required) The image file name. In this example, `file_name` and `id` match, but this is not a requirement for COCO datasets. 
+ `date_captured` – (Required) The date and time the image was captured. 

```
{
    "id": 245915,
    "license": 4,
    "coco_url": "http://images.cocodataset.org/val2017/nnnnnnnnnnnn.jpg",
    "flickr_url": "http://farm1.staticflickr.com/88/nnnnnnnnnnnnnnnnnnn.jpg",
    "width": 640,
    "height": 480,
    "file_name": "000000245915.jpg",
    "date_captured": "2013-11-18 02:53:27"
}
```

## annotations (bounding boxes) list


Bounding box information for all objects on all images is stored in the annotations list. A single annotation object contains bounding box information for a single object and the object's label on an image. There is an annotation object for each instance of an object on an image. 

In the following example, note the following information and which fields are required to create an Amazon Rekognition Custom Labels manifest file. 
+ `id` – (Not required) The identifier for the annotation.
+ `image_id` – (Required) Corresponds to the image `id` in the images array.
+ `category_id` – (Required) The identifier for the label that identifies the object within a bounding box. It maps to the `id` field of the categories array. 
+ `iscrowd` – (Not required) Specifies if the image contains a crowd of objects. 
+ `segmentation` – (Not required) Segmentation information for objects on an image. Amazon Rekognition Custom Labels doesn't support segmentation. 
+ `area` – (Not required) The area of the annotation.
+ `bbox` – (Required) Contains the coordinates, in pixels, of a bounding box around an object on the image.

```
{
    "id": 1409619,
    "category_id": 1,
    "iscrowd": 0,
    "segmentation": [
        [86.0, 238.8,..........382.74, 241.17]
    ],
    "image_id": 245915,
    "area": 3556.2197000000015,
    "bbox": [86, 65, 220, 334]
}
```

## categories list


Label information is stored in the categories array. In the following example category object, note the following information and which fields are required to create an Amazon Rekognition Custom Labels manifest file. 
+ `supercategory` – (Not required) The parent category for a label. 
+ `id` – (Required) The label identifier. The `id` field maps to the `category_id` field in an `annotation` object. In the following example, the identifier for an echo dot is 2. 
+ `name` – (Required) the label name. 

```
        {"supercategory": "speaker","id": 2,"name": "echo dot"}
```

# Transforming a COCO dataset


Use the following Python example to transform bounding box information from a COCO format dataset into an Amazon Rekognition Custom Labels manifest file. The code uploads the created manifest file to your Amazon S3 bucket. The code also provides an AWS CLI command that you can use to upload your images. 

**To transform a COCO dataset (SDK)**

1. If you haven't already:

   1. Make sure you have `AmazonS3FullAccess` permissions. For more information, see [Set up SDK permissions](su-sdk-permissions.md).

   1. Install and configure the AWS CLI and the AWS SDKs. For more information, see [Step 4: Set up the AWS CLI and AWS SDKs](su-awscli-sdk.md).

1. Use the following Python code to transform a COCO dataset. Set the following values.
   + `s3_bucket` – The name of the S3 bucket in which you want to store the images and Amazon Rekognition Custom Labels manifest file. 
   + `s3_key_path_images` – The path to where you want to place the images within the S3 bucket (`s3_bucket`).
   + `s3_key_path_manifest_file` – The path to where you want to place the Custom Labels manifest file within the S3 bucket (`s3_bucket`).
   + `local_path` – The local path to where the example opens the input COCO dataset and also saves the new Custom Labels manifest file.
   + `local_images_path` – The local path to the images that you want to use for training.
   + `coco_manifest` – The input COCO dataset filename.
   + `cl_manifest_file` – A name for the manifest file created by the example. The file is saved at the location specified by `local_path`. By convention, the file has the extension `.manifest`, but this is not required.
   + `job_name` – A name for the Custom Labels job.

   ```
   import json
   import os
   import random
   import shutil
   import datetime
   import botocore
   import boto3
   import PIL.Image as Image
   import io
   
   #S3 location for images
   s3_bucket = 'bucket'
   s3_key_path_manifest_file = 'path to custom labels manifest file/'
   s3_key_path_images = 'path to images/'
   s3_path='s3://' + s3_bucket  + '/' + s3_key_path_images
   s3 = boto3.resource('s3')
   
   #Local file information
   local_path='path to input COCO dataset and output Custom Labels manifest/'
   local_images_path='path to COCO images/'
   coco_manifest = 'COCO dataset JSON file name'
   coco_json_file = local_path + coco_manifest
   job_name='Custom Labels job name'
   cl_manifest_file = 'custom_labels.manifest'
   
   label_attribute ='bounding-box'
   
   open(local_path + cl_manifest_file, 'w').close()
   
   # class representing a Custom Label JSON line for an image
   class cl_json_line:  
       def __init__(self,job, img):  
   
            #Get image info. Annotations are dealt with separately
           sizes=[]
           image_size={}
           image_size["width"] = img["width"]
           image_size["depth"] = 3
           image_size["height"] = img["height"]
           sizes.append(image_size)
   
           bounding_box={}
           bounding_box["annotations"] = []
           bounding_box["image_size"] = sizes
   
           self.__dict__["source-ref"] = s3_path + img['file_name']
           self.__dict__[job] = bounding_box
   
           #get metadata
           metadata = {}
           metadata['job-name'] = job_name
           metadata['class-map'] = {}
           metadata['human-annotated']='yes'
           metadata['objects'] = [] 
           date_time_obj = datetime.datetime.strptime(img['date_captured'], '%Y-%m-%d %H:%M:%S')
           metadata['creation-date']= date_time_obj.strftime('%Y-%m-%dT%H:%M:%S') 
           metadata['type']='groundtruth/object-detection'
           
           self.__dict__[job + '-metadata'] = metadata
   
   
   print("Getting image, annotations, and categories from COCO file...")
   
   with open(coco_json_file) as f:
   
       #Get custom label compatible info    
       js = json.load(f)
       images = js['images']
       categories = js['categories']
       annotations = js['annotations']
   
       print('Images: ' + str(len(images)))
       print('annotations: ' + str(len(annotations)))
       print('categories: ' + str(len (categories)))
   
   
   print("Creating CL JSON lines...")
       
   images_dict = {image['id']: cl_json_line(label_attribute, image) for image in images}
   
   print('Parsing annotations...')
   for annotation in annotations:
   
       image=images_dict[annotation['image_id']]
   
       cl_annotation = {}
       cl_class_map={}
   
       # get bounding box information
       cl_bounding_box={}
       cl_bounding_box['left'] = annotation['bbox'][0]
       cl_bounding_box['top'] = annotation['bbox'][1]
    
       cl_bounding_box['width'] = annotation['bbox'][2]
       cl_bounding_box['height'] = annotation['bbox'][3]
       cl_bounding_box['class_id'] = annotation['category_id']
   
       getattr(image, label_attribute)['annotations'].append(cl_bounding_box)
   
   
       for category in categories:
            if annotation['category_id'] == category['id']:
               getattr(image, label_attribute + '-metadata')['class-map'][category['id']]=category['name']
           
       
       cl_object={}
       cl_object['confidence'] = int(1)  #not currently used by Custom Labels
       getattr(image, label_attribute + '-metadata')['objects'].append(cl_object)
   
   print('Done parsing annotations')
   
   # Create manifest file.
   print('Writing Custom Labels manifest...')
   
    # Write each image's JSON line to the manifest file.
    with open(local_path + cl_manifest_file, 'w') as outfile:
        for im in images_dict.values():
            json.dump(im.__dict__, outfile)
            outfile.write('\n')
   
   # Upload manifest file to S3 bucket.
   print ('Uploading Custom Labels manifest file to S3 bucket')
    print('Uploading ' + local_path + cl_manifest_file + ' to ' + s3_key_path_manifest_file)
   print(s3_bucket)
   s3.Bucket(s3_bucket).upload_file(local_path + cl_manifest_file, s3_key_path_manifest_file + cl_manifest_file)
   
    # Print S3 URL to manifest file.
   print ('S3 URL Path to manifest file. ')
   print('\033[1m s3://' + s3_bucket + '/' + s3_key_path_manifest_file + cl_manifest_file + '\033[0m') 
   
   # Display aws s3 sync command.
   print ('\nAWS CLI s3 sync command to upload your images to S3 bucket. ')
   print ('\033[1m aws s3 sync ' + local_images_path + ' ' + s3_path + '\033[0m')
   ```

1. Run the code.

1. In the program output, note the `s3 sync` command. You need it in the next step.

1. At the command prompt, run the `s3 sync` command. Your images are uploaded to the S3 bucket. If the command fails during upload, run it again until your local images are synchronized with the S3 bucket.

1. In the program output, note the S3 URL path to the manifest file. You need it in the next step.

1. Follow the instructions at [Creating a dataset with a SageMaker AI Ground Truth manifest file (Console)](md-create-dataset-ground-truth.md#md-create-dataset-ground-truth-console) to create a dataset with the uploaded manifest file. For step 8, in **.manifest file location**, enter the Amazon S3 URL you noted in the previous step. If you are using the AWS SDK, do [Creating a dataset with a SageMaker AI Ground Truth manifest file (SDK)](md-create-dataset-ground-truth.md#md-create-dataset-ground-truth-sdk).

# Transforming multi-label SageMaker AI Ground Truth manifest files
Transforming multi-label Ground Truth manifest files

This topic shows you how to transform a multi-label Amazon SageMaker AI Ground Truth manifest file to an Amazon Rekognition Custom Labels format manifest file. 

SageMaker AI Ground Truth manifest files for multi-label jobs are formatted differently than Amazon Rekognition Custom Labels format manifest files. In multi-label classification, an image is classified into a set of classes but might belong to multiple classes at once. In this case, the image can have multiple labels (multi-label), such as *football* and *ball*.

For information about multi-label SageMaker AI Ground Truth jobs, see [Image Classification (Multi-label)](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-image-classification-multilabel.html). For information about multi-label format Amazon Rekognition Custom Labels manifest files, see [Adding multiple image-level labels to an image](md-create-manifest-file-classification.md#md-dataset-purpose-classification-multiple-labels).
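The per-line transformation can be sketched as follows. This is a minimal illustration, not the documented procedure: the job name (`my-job`), image path, and label values are hypothetical, and the output layout mirrors the full script later in this topic.

```python
import json

# Hypothetical Ground Truth multi-label line: the job key ("my-job") holds a
# list of class indices, and the "-metadata" key maps each index to a class
# name and confidence.
gt_line = {
    "source-ref": "s3://amzn-s3-demo-bucket/images/football.jpg",
    "my-job": [0, 2],
    "my-job-metadata": {
        "class-map": {"0": "football", "2": "ball"},
        "confidence-map": {"0": 0.9, "2": 0.8},
        "job-name": "labeling-job/my-job",
        "human-annotated": "yes",
        "creation-date": "2022-01-21T14:21:05",
        "type": "groundtruth/image-classification-multilabel",
    },
}


def to_custom_labels(old):
    """Expand one multi-label line into per-label keys and metadata."""
    # The job key is the one that isn't source-ref and isn't metadata.
    job = next(k for k in old if "source-ref" not in k and "-metadata" not in k)
    new = {"source-ref": old["source-ref"]}
    meta = old[f"{job}-metadata"]
    for index, label in enumerate(old[job]):
        new[f"{job}{index}"] = index
        new[f"{job}{index}-metadata"] = {
            "class-name": meta["class-map"][str(label)],
            "confidence": meta["confidence-map"][str(label)],
            "type": "groundtruth/image-classification",
            "job-name": meta["job-name"],
            "human-annotated": meta["human-annotated"],
            "creation-date": meta["creation-date"],
        }
    return new


print(json.dumps(to_custom_labels(gt_line), indent=2))
```

Each of the image's labels becomes its own `jobN` key with a matching `jobN-metadata` object, which is the multi-label layout described in [Adding multiple image-level labels to an image](md-create-manifest-file-classification.md#md-dataset-purpose-classification-multiple-labels).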

## Getting the manifest file for a SageMaker AI Ground Truth job


The following procedure shows you how to get the output manifest file (`output.manifest`) for an Amazon SageMaker AI Ground Truth job. You use `output.manifest` as input to the next procedure.

**To download a SageMaker AI Ground Truth job manifest file**

1. Open the Amazon SageMaker console at [https://console.aws.amazon.com/sagemaker/](https://console.aws.amazon.com/sagemaker/). 

1. In the navigation pane, choose **Ground Truth** and then choose **Labeling Jobs**. 

1. Choose the labeling job that contains the manifest file that you want to use.

1. On the details page, choose the link under **Output dataset location**. The Amazon S3 console is opened at the dataset location. 

1. Choose `Manifests`, `output` and then `output.manifest`.

1. Choose **Object Actions** and then choose **Download** to download the manifest file.

## Transforming a multi-label SageMaker AI manifest file


The following procedure creates a multi-label format Amazon Rekognition Custom Labels manifest file from an existing multi-label format SageMaker AI Ground Truth manifest file.

**Note**  
To run the code, you need Python version 3 or later.<a name="md-procedure-multi-label-transform"></a>

**To transform a multi-label SageMaker AI manifest file**

1. Run the following Python code. Supply the name of the manifest file that you created in [Getting the manifest file for a SageMaker AI Ground Truth job](#md-get-gt-manifest) as a command line argument.

   ```
   # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
   # SPDX-License-Identifier:  Apache-2.0
   """
   Purpose
    Shows how to create an Amazon Rekognition Custom Labels format
   manifest file from an Amazon SageMaker Ground Truth Image
   Classification (Multi-label) format manifest file.
   """
   import json
   import logging
   import argparse
   import os.path
   
   logger = logging.getLogger(__name__)
   
   def create_manifest_file(ground_truth_manifest_file):
       """
       Creates an Amazon Rekognition Custom Labels format manifest file from
       an Amazon SageMaker Ground Truth Image Classification (Multi-label) format
       manifest file.
       :param: ground_truth_manifest_file: The name of the Ground Truth manifest file,
       including the relative path.
       :return: The name of the new Custom Labels manifest file.
       """
   
       logger.info('Creating manifest file from %s', ground_truth_manifest_file)
       new_manifest_file = f'custom_labels_{os.path.basename(ground_truth_manifest_file)}'
   
       # Read the SageMaker Ground Truth manifest file into memory.
       with open(ground_truth_manifest_file) as gt_file:
           lines = gt_file.readlines()
   
       #Iterate through the lines one at a time to generate the
       #new lines for the Custom Labels manifest file.
       with open(new_manifest_file, 'w') as the_new_file:
           for line in lines:
                #job_name - The name of the Amazon SageMaker Ground Truth job.
               job_name = ''
               # Load in the old json item from the Ground Truth manifest file
               old_json = json.loads(line)
   
               # Get the job name
               keys = old_json.keys()
               for key in keys:
                   if 'source-ref' not in key and '-metadata' not in key:
                       job_name = key
   
               new_json = {}
               # Set the location of the image
               new_json['source-ref'] = old_json['source-ref']
   
               # Temporarily store the list of labels
               labels = old_json[job_name]
   
               # Iterate through the labels and reformat to Custom Labels format
               for index, label in enumerate(labels):
                   new_json[f'{job_name}{index}'] = index
                   metadata = {}
                   metadata['class-name'] = old_json[f'{job_name}-metadata']['class-map'][str(label)]
                   metadata['confidence'] = old_json[f'{job_name}-metadata']['confidence-map'][str(label)]
                   metadata['type'] = 'groundtruth/image-classification'
                   metadata['job-name'] = old_json[f'{job_name}-metadata']['job-name']
                   metadata['human-annotated'] = old_json[f'{job_name}-metadata']['human-annotated']
                   metadata['creation-date'] = old_json[f'{job_name}-metadata']['creation-date']
                   # Add the metadata to new json line
                   new_json[f'{job_name}{index}-metadata'] = metadata
               # Write the current line to the json file
               the_new_file.write(json.dumps(new_json))
               the_new_file.write('\n')
   
       logger.info('Created %s', new_manifest_file)
       return  new_manifest_file
   
   def add_arguments(parser):
       """
       Adds command line arguments to the parser.
       :param parser: The command line parser.
       """
   
       parser.add_argument(
            "manifest_file", help="The Amazon SageMaker Ground Truth manifest file "
            "that you want to use."
       )
   
   
   def main():
       logging.basicConfig(level=logging.INFO,
                           format="%(levelname)s: %(message)s")
       try:
           # get command line arguments
           parser = argparse.ArgumentParser(usage=argparse.SUPPRESS)
           add_arguments(parser)
           args = parser.parse_args()
           # Create the manifest file
           manifest_file = create_manifest_file(args.manifest_file)
           print(f'Manifest file created: {manifest_file}')
       except FileNotFoundError as err:
           logger.exception('File not found: %s', err)
           print(f'File not found: {err}. Check your manifest file.')
   
   if __name__ == "__main__":
       main()
   ```

1. Note the name of the new manifest file that the script displays. You use it in the next step.

1. [Upload your manifest files](https://docs.aws.amazon.com/AmazonS3/latest/userguide/upload-objects.html) to the Amazon S3 bucket that you want to use for storing the manifest file.
**Note**  
Make sure Amazon Rekognition Custom Labels has access to the Amazon S3 bucket referenced in the `source-ref` field of the manifest file JSON lines. For more information, see [Accessing external Amazon S3 Buckets](su-console-policy.md#su-external-buckets). If your Ground Truth job stores images in the Amazon Rekognition Custom Labels Console Bucket, you don't need to add permissions.

1. Follow the instructions at [Creating a dataset with a SageMaker AI Ground Truth manifest file (Console)](md-create-dataset-ground-truth.md#md-create-dataset-ground-truth-console) to create a dataset with the uploaded manifest file. For step 8, in **.manifest file location**, enter the Amazon S3 URL for the location of the manifest file. If you are using the AWS SDK, do [Creating a dataset with a SageMaker AI Ground Truth manifest file (SDK)](md-create-dataset-ground-truth.md#md-create-dataset-ground-truth-sdk).

# Creating a manifest file from a CSV file


This example Python script simplifies the creation of a manifest file by using a Comma Separated Values (CSV) file to label images. You create the CSV file. The manifest file is suitable for [Image classification](getting-started.md#gs-image-classification-example) or [Multi-label image classification](getting-started.md#gs-multi-label-image-classification-example). For more information, see [Find objects, scenes, and concepts](understanding-custom-labels.md#tm-classification). 

**Note**  
This script doesn't create a manifest file suitable for finding [object locations](understanding-custom-labels.md#tm-object-localization) or for finding [brand locations](understanding-custom-labels.md#tm-brand-detection-localization).

A manifest file describes the images used to train a model, for example, the image locations and the labels assigned to the images. A manifest file is made up of one or more JSON lines. Each JSON line describes a single image. For more information, see [Importing image-level labels in manifest files](md-create-manifest-file-classification.md).

A CSV file represents tabular data over multiple rows in a text file. Fields on a row are separated by commas. For more information, see [comma separated values](https://en.wikipedia.org/wiki/Comma-separated_values). For this script, each row in your CSV file represents a single image and maps to a JSON Line in the manifest file. To create a CSV file for a manifest file that supports [Multi-label image classification](getting-started.md#gs-multi-label-image-classification-example), you add one or more image-level labels to each row. To create a manifest file suitable for [Image classification](getting-started.md#gs-image-classification-example), you add a single image-level label to each row.

For example, the following CSV file describes the images in the [Multi-label image classification](getting-started.md#gs-multi-label-image-classification-example) (Flowers) *Getting started* project. 

```
camellia1.jpg,camellia,with_leaves
camellia2.jpg,camellia,with_leaves
camellia3.jpg,camellia,without_leaves
helleborus1.jpg,helleborus,without_leaves,not_fully_grown
helleborus2.jpg,helleborus,with_leaves,fully_grown
helleborus3.jpg,helleborus,with_leaves,fully_grown
jonquil1.jpg,jonquil,with_leaves
jonquil2.jpg,jonquil,with_leaves
jonquil3.jpg,jonquil,with_leaves
jonquil4.jpg,jonquil,without_leaves
mauve_honey_myrtle1.jpg,mauve_honey_myrtle,without_leaves
mauve_honey_myrtle2.jpg,mauve_honey_myrtle,with_leaves
mauve_honey_myrtle3.jpg,mauve_honey_myrtle,with_leaves
mediterranean_spurge1.jpg,mediterranean_spurge,with_leaves
mediterranean_spurge2.jpg,mediterranean_spurge,without_leaves
```

The script generates a JSON Line for each row. For example, the following is the JSON Line for the first row (`camellia1.jpg,camellia,with_leaves`).

```
{"source-ref": "s3://bucket/flowers/train/camellia1.jpg","camellia": 1,"camellia-metadata":{"confidence": 1,"job-name": "labeling-job/camellia","class-name": "camellia","human-annotated": "yes","creation-date": "2022-01-21T14:21:05","type": "groundtruth/image-classification"},"with_leaves": 1,"with_leaves-metadata":{"confidence": 1,"job-name": "labeling-job/with_leaves","class-name": "with_leaves","human-annotated": "yes","creation-date": "2022-01-21T14:21:05","type": "groundtruth/image-classification"}}
```

In the example CSV, the Amazon S3 path to the image is not present. If your CSV file doesn't include the Amazon S3 path for the images, use the `--s3_path` command line argument to specify the Amazon S3 path to the image. 

The script records the first entry for each image in a deduplicated image CSV file. The deduplicated image CSV file contains a single instance of each image found in the input CSV file. Further occurrences of an image in the input CSV file are recorded in a duplicate image CSV file. If the script finds duplicate images, review the duplicate image CSV file and update the deduplicated image CSV file as necessary. Rerun the script with the deduplicated file. If no duplicates are found in the input CSV file, the script deletes the deduplicated image CSV file and the duplicate image CSV file, as they are empty. 
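The first-occurrence rule can be sketched as follows. This is a minimal in-memory illustration with hypothetical rows; the script itself reads the input CSV file and writes the deduplicated and duplicate CSV files.

```python
def split_duplicates(rows):
    """Keep the first occurrence of each image (field 0); collect later ones."""
    seen = set()
    deduplicated, duplicates = [], []
    for row in rows:
        if row[0] not in seen:
            seen.add(row[0])
            deduplicated.append(row)
        else:
            duplicates.append(row)
    return deduplicated, duplicates


rows = [
    ["camellia1.jpg", "camellia"],
    ["camellia1.jpg", "with_leaves"],  # second entry for the same image
    ["jonquil1.jpg", "jonquil"],
]
deduplicated, duplicates = split_duplicates(rows)
# deduplicated keeps the first camellia1.jpg row and the jonquil1.jpg row;
# duplicates records the second camellia1.jpg row for review.
```

If `duplicates` is non-empty, you would merge the correct labels into the kept row and rerun, which is what the duplicate image CSV file is for.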

 In this procedure, you create the CSV file and run the Python script to create the manifest file. 

**To create a manifest file from a CSV file**

1. Create a CSV file with the following fields in each row (one row per image). Don't add a header row to the CSV file.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/ex-csv-manifest.html)

   For example `camellia1.jpg,camellia,with_leaves` or `s3://my-bucket/flowers/train/camellia1.jpg,camellia,with_leaves` 

1. Save the CSV file.

1. Run the following Python script. Supply the following arguments:
   + `csv_file` – The CSV file that you created in step 1. The script creates the manifest file in the same location, using the CSV file name with a `.manifest` extension.
   + (Optional) `--s3_path s3://path_to_folder/` – The Amazon S3 path to add to the image file names (field 1). Use `--s3_path` if the images in field 1 don't already contain an S3 path.

   ```
   # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
   # SPDX-License-Identifier:  Apache-2.0
   
   from datetime import datetime, timezone
   import argparse
   import logging
   import csv
   import os
   import json
   
   """
   Purpose
   Amazon Rekognition Custom Labels model example used in the service documentation.
   Shows how to create an image-level (classification) manifest file from a CSV file.
   You can specify multiple image level labels per image.
   CSV file format is
   image,label,label,..
   If necessary, use the bucket argument to specify the S3 bucket folder for the images.
   https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/md-gt-cl-transform.html
   """
   
   logger = logging.getLogger(__name__)
   
   
   def check_duplicates(csv_file, deduplicated_file, duplicates_file):
       """
       Checks for duplicate images in a CSV file. If duplicate images
       are found, deduplicated_file is the deduplicated CSV file - only the first
        occurrence of a duplicate is recorded. Other duplicates are recorded in duplicates_file.
       :param csv_file: The source CSV file.
       :param deduplicated_file: The deduplicated CSV file to create. If no duplicates are found
       this file is removed.
       :param duplicates_file: The duplicate images CSV file to create. If no duplicates are found
       this file is removed.
       :return: True if duplicates are found, otherwise false.
       """
   
       logger.info("Deduplicating %s", csv_file)
   
       duplicates_found = False
   
       # Find duplicates.
       with open(csv_file, 'r', newline='', encoding="UTF-8") as f,\
               open(deduplicated_file, 'w', encoding="UTF-8") as dedup,\
               open(duplicates_file, 'w', encoding="UTF-8") as duplicates:
   
           reader = csv.reader(f, delimiter=',')
           dedup_writer = csv.writer(dedup)
           duplicates_writer = csv.writer(duplicates)
   
           entries = set()
           for row in reader:
               # Skip empty lines.
               if not ''.join(row).strip():
                   continue
   
               key = row[0]
               if key not in entries:
                   dedup_writer.writerow(row)
                   entries.add(key)
               else:
                   duplicates_writer.writerow(row)
                   duplicates_found = True
   
       if duplicates_found:
           logger.info("Duplicates found check %s", duplicates_file)
   
       else:
           os.remove(duplicates_file)
           os.remove(deduplicated_file)
   
       return duplicates_found
   
   
   def create_manifest_file(csv_file, manifest_file, s3_path):
       """
       Reads a CSV file and creates a Custom Labels classification manifest file.
       :param csv_file: The source CSV file.
       :param manifest_file: The name of the manifest file to create.
       :param s3_path: The S3 path to the folder that contains the images.
       """
       logger.info("Processing CSV file %s", csv_file)
   
       image_count = 0
       label_count = 0
   
       with open(csv_file, newline='', encoding="UTF-8") as csvfile,\
               open(manifest_file, "w", encoding="UTF-8") as output_file:
   
           image_classifications = csv.reader(
               csvfile, delimiter=',', quotechar='|')
   
           # Process each row (image) in CSV file.
           for row in image_classifications:
               source_ref = str(s3_path)+row[0]
   
               image_count += 1
   
               # Create JSON for image source ref.
               json_line = {}
               json_line['source-ref'] = source_ref
   
               # Process each image level label.
               for index in range(1, len(row)):
                   image_level_label = row[index]
   
                   # Skip empty columns.
                   if image_level_label == '':
                       continue
                   label_count += 1
   
                    # Create the JSON line metadata.
                   json_line[image_level_label] = 1
                   metadata = {}
                   metadata['confidence'] = 1
                   metadata['job-name'] = 'labeling-job/' + image_level_label
                   metadata['class-name'] = image_level_label
                   metadata['human-annotated'] = "yes"
                   metadata['creation-date'] = \
                       datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%S.%f')
                   metadata['type'] = "groundtruth/image-classification"
   
                   json_line[f'{image_level_label}-metadata'] = metadata
   
                # Write the image JSON Line.
                output_file.write(json.dumps(json_line))
                output_file.write('\n')
    
       logger.info("Finished creating manifest file %s\nImages: %s\nLabels: %s",
                   manifest_file, image_count, label_count)
   
       return image_count, label_count
   
   
   def add_arguments(parser):
       """
       Adds command line arguments to the parser.
       :param parser: The command line parser.
       """
   
       parser.add_argument(
           "csv_file", help="The CSV file that you want to process."
       )
   
       parser.add_argument(
           "--s3_path", help="The S3 bucket and folder path for the images."
           " If not supplied, column 1 is assumed to include the S3 path.", required=False
       )
   
   
   def main():
   
       logging.basicConfig(level=logging.INFO,
                           format="%(levelname)s: %(message)s")
   
       try:
   
           # Get command line arguments
           parser = argparse.ArgumentParser(usage=argparse.SUPPRESS)
           add_arguments(parser)
           args = parser.parse_args()
   
           s3_path = args.s3_path
           if s3_path is None:
               s3_path = ''
   
           # Create file names.
           csv_file = args.csv_file
           file_name = os.path.splitext(csv_file)[0]
           manifest_file = f'{file_name}.manifest'
           duplicates_file = f'{file_name}-duplicates.csv'
           deduplicated_file = f'{file_name}-deduplicated.csv'
   
           # Create manifest file, if there are no duplicate images.
           if check_duplicates(csv_file, deduplicated_file, duplicates_file):
               print(f"Duplicates found. Use {duplicates_file} to view duplicates "
                     f"and then update {deduplicated_file}. ")
                print(f"{deduplicated_file} contains the first occurrence of a duplicate. "
                     "Update as necessary with the correct label information.")
               print(f"Re-run the script with {deduplicated_file}")
           else:
               print("No duplicates found. Creating manifest file.")
   
               image_count, label_count = create_manifest_file(csv_file,
                                                               manifest_file,
                                                               s3_path)
   
               print(f"Finished creating manifest file: {manifest_file} \n"
                     f"Images: {image_count}\nLabels: {label_count}")
   
       except FileNotFoundError as err:
           logger.exception("File not found: %s", err)
           print(f"File not found: {err}. Check your input CSV file.")
   
   
   if __name__ == "__main__":
       main()
   ```

1. If you plan to use a test dataset, repeat steps 1–3 to create a manifest file for your test dataset.

1. If necessary, copy the images to the Amazon S3 bucket path that you specified in column 1 of the CSV file (or specified with the `--s3_path` command line argument). You can use the following AWS CLI command.

   ```
   aws s3 cp --recursive your-local-folder s3://your-target-S3-location
   ```

1. [Upload your manifest files](https://docs.aws.amazon.com/AmazonS3/latest/userguide/upload-objects.html) to the Amazon S3 bucket that you want to use for storing the manifest file.
**Note**  
Make sure Amazon Rekognition Custom Labels has access to the Amazon S3 bucket referenced in the `source-ref` field of the manifest file JSON lines. For more information, see [Accessing external Amazon S3 Buckets](su-console-policy.md#su-external-buckets). If your Ground Truth job stores images in the Amazon Rekognition Custom Labels Console Bucket, you don't need to add permissions.

1. Follow the instructions at [Creating a dataset with a SageMaker AI Ground Truth manifest file (Console)](md-create-dataset-ground-truth.md#md-create-dataset-ground-truth-console) to create a dataset with the uploaded manifest file. For step 8, in **.manifest file location**, enter the Amazon S3 URL for the location of the manifest file. If you are using the AWS SDK, do [Creating a dataset with a SageMaker AI Ground Truth manifest file (SDK)](md-create-dataset-ground-truth.md#md-create-dataset-ground-truth-sdk).

# Copying content from an existing dataset


If you've previously created a dataset, you can copy its contents to a new dataset. To create a dataset from an existing dataset with the AWS SDK, see [Creating a dataset using an existing dataset (SDK)](md-create-dataset-existing-dataset-sdk.md).

**To create a dataset using an existing Amazon Rekognition Custom Labels dataset (console)**

1. Open the Amazon Rekognition console at [https://console.aws.amazon.com/rekognition/](https://console.aws.amazon.com/rekognition/).

1. Choose **Use Custom Labels**.

1. Choose **Get started**. 

1. In the left navigation pane, choose **Projects**.

1. In the **Projects** page, choose the project to which you want to add a dataset. The details page for your project is displayed.

1. Choose **Create dataset**. The **Create dataset** page is shown.

1. In **Starting configuration**, choose either **Start with a single dataset** or **Start with a training dataset**. To create a higher quality model, we recommend starting with separate training and test datasets.

------
#### [ Single dataset ]

   1. In the **Training dataset details** section, choose **Copy an existing Amazon Rekognition Custom Labels dataset**.

   1. In the **Training dataset details** section, in the **Dataset** edit box, type or select the name of the dataset that you want to copy. 

   1. Choose **Create Dataset**. The datasets page for your project opens.

------
#### [ Separate training and test datasets ]

   1. In the **Training dataset details** section, choose **Copy an existing Amazon Rekognition Custom Labels dataset**.

   1. In the **Training dataset details** section, in the **Dataset** edit box, type or select the name of the dataset that you want to copy. 

   1. In the **Test dataset details** section, choose **Copy an existing Amazon Rekognition Custom Labels dataset**.

   1. In the **Test dataset details** section, in the **Dataset** edit box, type or select the name of the dataset that you want to copy. 
**Note**  
Your training and test datasets can have different image sources.

   1. Choose **Create Datasets**. The datasets page for your project opens.

------

1. If you need to add or change labels, do [Labeling images](md-labeling-images.md).

1. Follow the steps in [Training a model (Console)](training-model.md#tm-console) to train your model.

# Labeling images


A label identifies an object, scene, concept, or bounding box around an object in an image. For example, if your dataset contains images of dogs, you might add labels for breeds of dogs. 

After importing your images into a dataset, you might need to add labels to images or correct mislabeled images. For example, images aren't labeled if they are imported from a local computer. You use the dataset gallery to add new labels to the dataset and assign labels and bounding boxes to images in the dataset. 

How you label the images in your datasets determines the type of model that Amazon Rekognition Custom Labels trains. For more information, see [Purposing datasets](md-dataset-purpose.md). 

**Topics**
+ [

# Managing labels
](md-labels.md)
+ [

# Assigning image-level labels to an image
](md-assign-image-level-labels.md)
+ [

# Labeling objects with bounding boxes
](md-localize-objects.md)

# Managing labels


You can manage labels by using the Amazon Rekognition Custom Labels console. There isn't a specific API for managing labels – labels are added to the dataset when you create the dataset with `CreateDataset` or when you add more images to the dataset with `UpdateDatasetEntries`.

**Topics**
+ [

## Managing labels (Console)
](#md-labels-console)
+ [

## Managing Labels (SDK)
](#md-labels-sdk)

## Managing labels (Console)


You can use the Amazon Rekognition Custom Labels console to add, change, or remove labels from a dataset. To add a label to a dataset, you can add a new label that you create or import labels from an existing dataset in Rekognition.

**Topics**
+ [

### Add new labels (Console)
](#md-add-new-labels)
+ [

### Change and remove labels (Console)
](#md-edit-labels-after-adding)

### Add new labels (Console)


You can specify new labels that you want to add to your dataset. 

#### Add labels using the editing window


**To add a new label (console)**

1. Open the Amazon Rekognition console at [https://console.aws.amazon.com/rekognition/](https://console.aws.amazon.com/rekognition/).

1. Choose **Use Custom Labels**.

1. Choose **Get started**. 

1. In the left navigation pane, choose **Projects**.

1. In the **Projects** page, choose the project that you want to use. The details page for your project is displayed.

1. If you want to add labels to your training dataset, choose the **Training** tab. Otherwise choose the **Test** tab to add labels to the test dataset. 

1. Choose **Start labeling** to enter labeling mode.

1. In the **Labels** section of the dataset gallery, choose **Manage labels** to open the **Manage labels** dialog box.

1. In the edit box, enter a new label name.

1. Choose **Add label**.

1. Repeat steps 9 and 10 until you have created all the labels you need.

1. Choose **Save** to save the labels that you added.

### Change and remove labels (Console)


You can rename or remove labels after adding them to a dataset. You can only remove labels that are not assigned to any images.

**To rename or remove an existing label (console)**

1. Open the Amazon Rekognition console at [https://console.aws.amazon.com/rekognition/](https://console.aws.amazon.com/rekognition/).

1. Choose **Use Custom Labels**.

1. Choose **Get started**. 

1. In the left navigation pane, choose **Projects**.

1. In the **Projects** page, choose the project that you want to use. The details page for your project is displayed.

1. If you want to change or delete labels in your training dataset, choose the **Training** tab. Otherwise, choose the **Test** tab to change or delete labels in the test dataset. 

1. Choose **Start labeling** to enter labeling mode.

1. In the **Labels** section of the dataset gallery, choose **Manage labels** to open the **Manage labels** dialog box.

1. Choose the label that you want to edit or delete.   
![\[Manage labels dialog box showing a text field to add a new label and an existing label named "test", with options to save or cancel changes.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/change-delete-label.jpg)

   1. If you choose the delete icon (X), the label is removed from the list.

   1. If you want to change the label, choose the edit icon (pencil and paper pad) and enter a new label name in the edit box. 

1. Choose **Save** to save your changes.

## Managing Labels (SDK)


There isn't a dedicated API for managing dataset labels. If you create a dataset with `CreateDataset`, the labels found in the manifest file or copied dataset create the initial set of labels. If you add more images with the `UpdateDatasetEntries` API, any new labels found in the entries are added to the dataset. For more information, see [Adding more images (SDK)](md-add-images.md#md-add-images-sdk). To delete a label from a dataset, you must remove every annotation for that label from the dataset.

**To delete labels from a dataset**

1. Call `ListDatasetEntries` to get the dataset entries. For example code, see [Listing dataset entries (SDK)](md-listing-dataset-entries-sdk.md).

1. In the file, remove any label annotations. For more information, see [Importing image-level labels in manifest files](md-create-manifest-file-classification.md) and [Object localization in manifest files](md-create-manifest-file-object-detection.md). 

1. Use the file to update the dataset with the `UpdateDatasetEntries` API. For more information, see [Adding more images (SDK)](md-add-images.md#md-add-images-sdk).
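The three steps above can be sketched in Python. The helper below strips one image-level label from a classification JSON Line; the attribute name (`my-label`) and the commented service call are illustrative assumptions, not values the service requires.

```python
import json

def remove_image_level_label(json_line, label_attribute):
    """Strip an image-level label attribute and its metadata from one dataset entry."""
    entry = json.loads(json_line)
    entry.pop(label_attribute, None)                 # the label attribute itself
    entry.pop(f"{label_attribute}-metadata", None)   # its companion metadata object
    return json.dumps(entry)

# Sketch of the update call (assumes credentials and a dataset ARN):
# import boto3
# client = boto3.client("rekognition")
# updated = "\n".join(remove_image_level_label(line, "my-label") for line in entries)
# client.update_dataset_entries(DatasetArn=dataset_arn,
#                               Changes={"GroundTruth": updated.encode()})
```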

# Assigning image-level labels to an image


You use image-level labels to train models that classify images into categories. An image-level label indicates that an image contains an object, scene or concept. For example, the following image shows a river. If your model classifies images as containing rivers, you would add a *river* image-level label. For more information, see [Purposing datasets](md-dataset-purpose.md). 

![\[Lake reflecting mountains and clouds in still water at sunset or sunrise.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/pateros.jpg)


A dataset that contains image-level labels needs at least two labels defined. Each image needs at least one assigned label that identifies the object, scene, or concept in the image.

**To assign image-level labels to an image (console)**

1. Open the Amazon Rekognition console at [https://console.aws.amazon.com/rekognition/](https://console.aws.amazon.com/rekognition/).

1. Choose **Use Custom Labels**.

1. Choose **Get started**. 

1. In the left navigation pane, choose **Projects**.

1. In the **Projects** page, choose the project that you want to use. The details page for your project is displayed.

1. In the left navigation pane, choose **Dataset**. 

1. If you want to add labels to your training dataset, choose the **Training** tab. Otherwise choose the **Test** tab to add labels to the test dataset. 

1. Choose **Start labeling** to enter labeling mode.

1. In the image gallery, select one or more images that you want to add labels to. You can only select images on a single page at a time. To select a contiguous range of images on a page:

   1. Select the first image in the range.

   1. Press and hold the shift key.

   1. Select the last image in the range. The images between the first and last images are also selected. 

   1. Release the shift key.

1. Choose **Assign image-level labels**. 

1. In the **Assign image-level label to selected images** dialog box, select a label that you want to assign to the image or images.

1. Choose **Assign** to assign the label to the selected images.

1. Repeat labeling until every image is annotated with the required labels.

1. Choose **Save changes** to save your changes.

## Assign image-level labels (SDK)


You can use the `UpdateDatasetEntries` API to add or update the image-level labels that are assigned to an image. `UpdateDatasetEntries` takes one or more JSON lines. Each JSON Line represents a single image. For an image with an image-level label, the JSON Line looks similar to the following. 

```
{"source-ref":"s3://custom-labels-console-us-east-1-nnnnnnnnnn/gt-job/manifest/IMG_1133.png","TestCLConsoleBucket":0,"TestCLConsoleBucket-metadata":{"confidence":0.95,"job-name":"labeling-job/testclconsolebucket","class-name":"Echo Dot","human-annotated":"yes","creation-date":"2020-04-15T20:17:23.433061","type":"groundtruth/image-classification"}}
```

The `source-ref` field indicates the location of the image. The JSON line also includes the image-level labels assigned to the image. For more information, see [Importing image-level labels in manifest files](md-create-manifest-file-classification.md).

**To assign image-level labels to an image**

1. Get the JSON Line for the existing image by using the `ListDatasetEntries` API. For the `source-ref` field, specify the location of the image that you want to assign the label to. For more information, see [Listing dataset entries (SDK)](md-listing-dataset-entries-sdk.md). 

1. Update the JSON Line returned in the previous step using the information at [Importing image-level labels in manifest files](md-create-manifest-file-classification.md).

1. Call `UpdateDatasetEntries` to update the image. For more information, see [Adding more images to a dataset](md-add-images.md).
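As a sketch of steps 2 and 3, the following builds a classification JSON Line and shows (commented) how it could be passed to `UpdateDatasetEntries`. The attribute name `my-label` and the bucket path are placeholders, not values the service requires.

```python
import json
from datetime import datetime, timezone

def make_classification_entry(image_uri, class_name, attribute="my-label"):
    """Build a JSON Line that assigns one image-level label to an image."""
    return json.dumps({
        "source-ref": image_uri,
        attribute: 0,
        f"{attribute}-metadata": {
            "confidence": 1,
            "class-name": class_name,
            "human-annotated": "yes",
            "creation-date": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S"),
            "type": "groundtruth/image-classification",
        },
    })

# Sketch of the update call (assumes credentials and a dataset ARN):
# import boto3
# boto3.client("rekognition").update_dataset_entries(
#     DatasetArn=dataset_arn,
#     Changes={"GroundTruth": make_classification_entry(
#         "s3://my-bucket/images/IMG_1133.png", "Echo Dot").encode()})
```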

# Labeling objects with bounding boxes


If you want your model to detect the location of objects within an image, you must identify what the object is and where it is in the image. A bounding box is a box that isolates an object in an image. You use bounding boxes to train a model to detect different objects in the same image. You identify the object by assigning a label to the bounding box. 

**Note**  
If you're training a model to find objects, scenes, and concepts with image-level labels, you don't need to do this step.

For example, if you want to train a model that detects Amazon Echo Dot devices, you draw a bounding box around each Echo Dot in an image and assign a label named *Echo Dot* to the bounding box. The following image shows a bounding box around an Echo Dot device. The image also contains an Amazon Echo without a bounding box.

![\[Amazon Echo Dot and Echo devices, with bounding box around Echo Dot.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/dot.jpg)


## Locate objects with bounding boxes (Console)


 In this procedure, you use the console to draw bounding boxes around the objects in your images. You can also identify objects within the image by assigning labels to the bounding boxes. 

**Note**  
You can't use the Safari browser to add bounding boxes to images. For supported browsers, see [Setting up Amazon Rekognition Custom Labels](setting-up.md).

Before you can add bounding boxes, you must add at least one label to the dataset. For more information, see [Add new labels (Console)](md-labels.md#md-add-new-labels).


**To add bounding boxes to images (console)**

1. Open the Amazon Rekognition console at [https://console.aws.amazon.com/rekognition/](https://console.aws.amazon.com/rekognition/).

1. Choose **Use Custom Labels**.

1. Choose **Get started**. 

1. In the left navigation pane, choose **Projects**.

1. In the **Projects** page, choose the project that you want to use. The details page for your project is displayed.

1. On the project details page, choose **Label images**.

1. If you want to add bounding boxes to your training dataset images, choose the **Training** tab. Otherwise choose the **Test** tab to add bounding boxes to the test dataset images. 

1. Choose **Start labeling** to enter labeling mode.

1. In the image gallery, choose the images that you want to add bounding boxes to.

1. Choose **Draw bounding box**. A series of tips are shown before the bounding box editor is displayed.

1. In the **Labels** pane on the right, select the label that you want to assign to a bounding box.

1. In the drawing tool, place your pointer at the top-left area of the desired object.

1. Press the left mouse button and draw a box around the object. Try to draw the bounding box as close as possible to the object. 

1. Release the mouse button. The bounding box is highlighted.

1. Choose **Next** if you have more images to label. Otherwise, choose **Done** to finish labeling.  
![\[UI to draw bounding box around an image, the image is Amazon Echo and Echo Dot smart speakers on a wooden surface.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/draw-bounding-box.png)

1. Repeat the preceding labeling steps until you have created a bounding box in each image that contains objects. 

1. Choose **Save changes** to save your changes. 

1. Choose **Exit** to exit labeling mode.

## Locate objects with bounding boxes (SDK)


You can use the `UpdateDatasetEntries` API to add or update object location information for an image. `UpdateDatasetEntries` takes one or more JSON lines. Each JSON Line represents a single image. For object localization, a JSON Line looks similar to the following. 

```
{"source-ref": "s3://bucket/images/IMG_1186.png", "bounding-box": {"image_size": [{"width": 640, "height": 480, "depth": 3}], "annotations": [{ "class_id": 1,	"top": 251,	"left": 399, "width": 155, "height": 101}, {"class_id": 0, "top": 65, "left": 86, "width": 220,	"height": 334}]}, "bounding-box-metadata": {"objects": [{ "confidence": 1}, {"confidence": 1}],	"class-map": {"0": "Echo",	"1": "Echo Dot"}, "type": "groundtruth/object-detection", "human-annotated": "yes",	"creation-date": "2013-11-18T02:53:27", "job-name": "my job"}}
```

The `source-ref` field indicates the location of the image. The JSON line also includes labeled bounding boxes for each object on the image. For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md).

**To assign bounding boxes to an image**

1. Get the JSON Line for the existing image by using the `ListDatasetEntries` API. For the `source-ref` field, specify the location of the image that you want to add bounding boxes to. For more information, see [Listing dataset entries (SDK)](md-listing-dataset-entries-sdk.md).

1. Update the JSON Line returned in the previous step using the information at [Object localization in manifest files](md-create-manifest-file-object-detection.md).

1. Call `UpdateDatasetEntries` to update the image. For more information, see [Adding more images to a dataset](md-add-images.md).
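The object-localization JSON Line from step 2 can be assembled with a helper like the following. This is a sketch mirroring the sample entry shown earlier in this section; the `creation-date` and `job-name` values are placeholders. Pass the result to `UpdateDatasetEntries` as shown in the previous section.

```python
import json

def make_localization_entry(image_uri, image_size, boxes, class_map):
    """Build a JSON Line with labeled bounding boxes for one image.

    boxes is a list of dicts with class_id, top, left, width, and height keys.
    """
    return json.dumps({
        "source-ref": image_uri,
        "bounding-box": {
            "image_size": [image_size],
            "annotations": boxes,
        },
        "bounding-box-metadata": {
            "objects": [{"confidence": 1} for _ in boxes],  # one entry per box
            "class-map": class_map,
            "type": "groundtruth/object-detection",
            "human-annotated": "yes",
            "creation-date": "2013-11-18T02:53:27",  # placeholder timestamp
            "job-name": "my job",                    # placeholder job name
        },
    })
```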

# Debugging datasets


During dataset creation, two types of errors can occur — *terminal errors* and *non-terminal errors*. Terminal errors stop dataset creation or update. Non-terminal errors don't stop dataset creation or update.

**Topics**
+ [

# Debugging terminal dataset errors
](debugging-datasets-terminal-errors.md)
+ [

# Debugging non-terminal dataset errors
](debugging-datasets-non-terminal-errors.md)

# Debugging terminal dataset errors


 There are two types of terminal errors — file errors that cause dataset creation to fail, and content errors that Amazon Rekognition Custom Labels removes from the dataset. Dataset creation fails if there are too many content errors.

**Topics**
+ [

## Terminal file errors
](#debugging-datasets-terminal-file-errors)
+ [

## Terminal content errors
](#debugging-datasets-terminal-content-errors)

## Terminal file errors


The following are file errors. You can get information about file errors by calling `DescribeDataset` and checking the `Status` and `StatusMessage` fields. For example code, see [Describing a dataset (SDK)](md-describing-dataset-sdk.md).
+ [ERROR\_MANIFEST\_INACCESSIBLE\_OR\_UNSUPPORTED\_FORMAT](#md-error-status-ERROR_MANIFEST_INACCESSIBLE_OR_UNSUPPORTED_FORMAT)
+ [ERROR\_MANIFEST\_SIZE\_TOO\_LARGE](#md-error-status-ERROR_MANIFEST_SIZE_TOO_LARGE)
+ [ERROR\_MANIFEST\_ROWS\_EXCEEDS\_MAXIMUM](#md-error-status-ERROR_MANIFEST_ROWS_EXCEEDS_MAXIMUM)
+ [ERROR\_INVALID\_PERMISSIONS\_MANIFEST\_S3\_BUCKET](#md-error-status-ERROR_INVALID_PERMISSIONS_MANIFEST_S3_BUCKET)
+ [ERROR\_TOO\_MANY\_RECORDS\_IN\_ERROR](#md-error-status-ERROR_TOO_MANY_RECORDS_IN_ERROR)
+ [ERROR\_MANIFEST\_TOO\_MANY\_LABELS](#md-error-status-ERROR_MANIFEST_TOO_MANY_LABELS)
+ [ERROR\_INSUFFICIENT\_IMAGES\_PER\_LABEL\_FOR\_DISTRIBUTE](#md-error-status-ERROR_INSUFFICIENT_IMAGES_PER_LABEL_FOR_DISTRIBUTE)
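You can check for these file errors programmatically. The helper below pulls the `Status` and `StatusMessage` fields out of a `DescribeDataset` response; the commented service call is a sketch that assumes credentials and a dataset ARN.

```python
def dataset_status(describe_response):
    """Return the status code and message from a DescribeDataset response."""
    description = describe_response["DatasetDescription"]
    return description["Status"], description.get("StatusMessage", "")

# Sketch of the service call:
# import boto3
# response = boto3.client("rekognition").describe_dataset(DatasetArn=dataset_arn)
# status, message = dataset_status(response)
```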

### ERROR\_MANIFEST\_INACCESSIBLE\_OR\_UNSUPPORTED\_FORMAT


#### Error message


The manifest file extension or contents are invalid.

The training or testing manifest file doesn't have a file extension or its contents are invalid. 

**To fix error *ERROR\_MANIFEST\_INACCESSIBLE\_OR\_UNSUPPORTED\_FORMAT***
+ Check the following possible causes in both the training and testing manifest files.
  + The manifest file is missing a file extension. By convention the file extension is `.manifest`.
  +  The Amazon S3 bucket or key for the manifest file couldn't be found.

### ERROR\_MANIFEST\_SIZE\_TOO\_LARGE


#### Error message


The manifest file size exceeds the maximum supported size.

The training or testing manifest file size (in bytes) is too large. For more information, see [Guidelines and quotas in Amazon Rekognition Custom Labels](limits.md). A manifest file can have less than the maximum number of JSON Lines and still exceed the maximum file size.

You can't use the Amazon Rekognition Custom Labels console to fix error *The manifest file size exceeds the maximum supported size*.

**To fix error *ERROR\_MANIFEST\_SIZE\_TOO\_LARGE***

1. Check which of the training and testing manifests exceed the maximum file size.

1. Reduce the number of JSON Lines in the manifest files that are too large. For more information, see [Creating a manifest file](md-create-manifest-file.md).

### ERROR\_MANIFEST\_ROWS\_EXCEEDS\_MAXIMUM


#### Error message


The manifest file has too many rows.

#### More information


The number of JSON Lines (number of images) in the manifest file is greater than the allowed limit. The limit is different for image-level models and object location models. For more information, see [Guidelines and quotas in Amazon Rekognition Custom Labels](limits.md). 

JSON Lines are validated for errors until the number of JSON Lines reaches the `ERROR_MANIFEST_ROWS_EXCEEDS_MAXIMUM` limit. 

You can't use the Amazon Rekognition Custom Labels console to fix error `ERROR_MANIFEST_ROWS_EXCEEDS_MAXIMUM`.

**To fix `ERROR_MANIFEST_ROWS_EXCEEDS_MAXIMUM`**
+ Reduce the number of JSON Lines in the manifest. For more information, see [Creating a manifest file](md-create-manifest-file.md).



### ERROR\_INVALID\_PERMISSIONS\_MANIFEST\_S3\_BUCKET


#### Error message


The S3 bucket permissions are incorrect.

Amazon Rekognition Custom Labels doesn't have permissions to one or more of the buckets containing the training and testing manifest files. 

You can't use the Amazon Rekognition Custom Labels console to fix this error.

**To fix error *ERROR\_INVALID\_PERMISSIONS\_MANIFEST\_S3\_BUCKET***
+ Check the permissions for the bucket(s) containing the training and testing manifests. For more information, see [Step 2: Set up Amazon Rekognition Custom Labels console permissions](su-console-policy.md).

### ERROR\_TOO\_MANY\_RECORDS\_IN\_ERROR


#### Error message


 The manifest file has too many terminal errors.

**To fix `ERROR_TOO_MANY_RECORDS_IN_ERROR`**
+ Reduce the number of JSON Lines (images) with terminal content errors. For more information, see [Terminal manifest content errors](tm-debugging-aggregate-errors.md). 

You can't use the Amazon Rekognition Custom Labels console to fix this error.

### ERROR\_MANIFEST\_TOO\_MANY\_LABELS


#### Error message


The manifest file has too many labels.

#### More information


The number of unique labels in the manifest (dataset) is more than the allowed limit. If the training dataset is split to create a testing dataset, the number of labels is determined after the split. 

**To fix ERROR\_MANIFEST\_TOO\_MANY\_LABELS (Console)**
+ Remove labels from the dataset. For more information, see [Managing labels](md-labels.md). The labels are automatically removed from the images and bounding boxes in your dataset.



**To fix ERROR\_MANIFEST\_TOO\_MANY\_LABELS (JSON Line)**
+ Manifests with image-level JSON Lines – If the image has a single label, remove the JSON Lines for images that use the label that you want to remove. If the JSON Line contains multiple labels, remove only the JSON object for that label. For more information, see [Adding multiple image-level labels to an image](md-create-manifest-file-classification.md#md-dataset-purpose-classification-multiple-labels). 

  Manifests with object location JSON Lines – Remove the bounding box and associated label information for the label that you want to remove. Do this for each JSON Line that contains the label. You also need to remove the label from the `class-map` array and the corresponding objects in the `objects` and `annotations` arrays. For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md).

### ERROR\_INSUFFICIENT\_IMAGES\_PER\_LABEL\_FOR\_DISTRIBUTE


#### Error message


The manifest file doesn't have enough labeled images to distribute the dataset.



Dataset distribution occurs when Amazon Rekognition Custom Labels splits a training dataset to create a test dataset. You can also split a dataset by calling the `DistributeDatasetEntries` API.

**To fix error *ERROR\_INSUFFICIENT\_IMAGES\_PER\_LABEL\_FOR\_DISTRIBUTE***
+ Add more labeled images to the training dataset.

## Terminal content errors


The following are terminal content errors. During dataset creation, images that have terminal content errors are removed from the dataset. The dataset can still be used for training. If there are too many content errors, dataset creation or update fails. Terminal content errors related to dataset operations aren't displayed in the console or returned from `DescribeDataset` or other APIs. If you notice that images or annotations are missing from your datasets, check your dataset manifest files for the following issues: 
+ A JSON Line is too long. The maximum length is 100,000 characters.
+ The `source-ref` value is missing from a JSON Line.
+ The format of a `source-ref` value in a JSON Line is invalid.
+ The contents of a JSON Line are not valid.
+ The value of a `source-ref` field appears more than once. An image can only be referenced once in a dataset.

For information about the `source-ref` field, see [Creating a manifest file](md-create-manifest-file.md). 
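You can approximate these checks locally before creating a dataset. The following sketch flags the issues listed above in a set of manifest JSON Lines (the 100,000-character limit is the one stated in this list; other service limits may also apply):

```python
import json

def check_manifest_lines(lines, max_len=100000):
    """Flag common terminal content errors in manifest JSON Lines."""
    problems = []
    seen = set()  # source-ref values already encountered
    for number, line in enumerate(lines, start=1):
        if len(line) > max_len:
            problems.append((number, "JSON Line too long"))
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "contents not valid JSON"))
            continue
        ref = entry.get("source-ref")
        if not isinstance(ref, str) or not ref.startswith("s3://"):
            problems.append((number, "missing or invalid source-ref"))
        elif ref in seen:
            problems.append((number, "duplicate source-ref"))
        else:
            seen.add(ref)
    return problems
```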

# Debugging non-terminal dataset errors


The following are non-terminal errors that can occur during dataset creation or update. These errors can invalidate an entire JSON Line or invalidate annotations within a JSON Line. If a JSON Line has an error, it is not used for training. If an annotation within a JSON Line has an error, the JSON Line is still used for training, but without the broken annotation. For more information about JSON Lines, see [Creating a manifest file](md-create-manifest-file.md).

You can access non-terminal errors from the console and by calling the `ListDatasetEntries` API. For more information, see [Listing dataset entries (SDK)](md-listing-dataset-entries-sdk.md).

The following errors are also returned during training. We recommend that you fix these errors before training your model. For more information, see [Non-Terminal JSON Line Validation Errors](tm-debugging-json-line-errors.md).
+ [ERROR\_NO\_LABEL\_ATTRIBUTES](tm-debugging-json-line-errors.md#tm-error-ERROR_NO_LABEL_ATTRIBUTES)
+ [ERROR\_INVALID\_LABEL\_ATTRIBUTE\_FORMAT](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_LABEL_ATTRIBUTE_FORMAT)
+ [ERROR\_INVALID\_LABEL\_ATTRIBUTE\_METADATA\_FORMAT](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_LABEL_ATTRIBUTE_METADATA_FORMAT)
+ [ERROR\_NO\_VALID\_LABEL\_ATTRIBUTES](tm-debugging-json-line-errors.md#tm-error-ERROR_NO_VALID_LABEL_ATTRIBUTES)
+ [ERROR\_INVALID\_BOUNDING\_BOX](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_BOUNDING_BOX)
+ [ERROR\_INVALID\_IMAGE\_DIMENSION](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_IMAGE_DIMENSION)
+ [ERROR\_BOUNDING\_BOX\_TOO\_SMALL](tm-debugging-json-line-errors.md#tm-error-ERROR_BOUNDING_BOX_TOO_SMALL)
+ [ERROR\_NO\_VALID\_ANNOTATIONS](tm-debugging-json-line-errors.md#tm-error-ERROR_NO_VALID_ANNOTATIONS)
+ [ERROR\_MISSING\_BOUNDING\_BOX\_CONFIDENCE](tm-debugging-json-line-errors.md#tm-error-ERROR_MISSING_BOUNDING_BOX_CONFIDENCE)
+ [ERROR\_MISSING\_CLASS\_MAP\_ID](tm-debugging-json-line-errors.md#tm-error-ERROR_MISSING_CLASS_MAP_ID)
+ [ERROR\_TOO\_MANY\_BOUNDING\_BOXES](tm-debugging-json-line-errors.md#tm-error-ERROR_TOO_MANY_BOUNDING_BOXES)
+ [ERROR\_UNSUPPORTED\_USE\_CASE\_TYPE](tm-debugging-json-line-errors.md#tm-error-ERROR_UNSUPPORTED_USE_CASE_TYPE)
+ [ERROR\_INVALID\_LABEL\_NAME\_LENGTH](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_LABEL_NAME_LENGTH)

## Accessing non-terminal errors


You can use the console to find out which images in a dataset have non-terminal errors. You can also call the `ListDatasetEntries` API to get the error messages. For more information, see [Listing dataset entries (SDK)](md-listing-dataset-entries-sdk.md). 

**To access non-terminal errors (console)**

1. Open the Amazon Rekognition console at [https://console.aws.amazon.com/rekognition/](https://console.aws.amazon.com/rekognition/).

1. Choose **Use Custom Labels**.

1. Choose **Get started**. 

1. In the left navigation pane, choose **Projects**.

1. In the **Projects** page, choose the project that you want to use. The details page for your project is displayed.

1. If you want to view non-terminal errors in your training dataset, choose the **Training** tab. Otherwise choose the **Test** tab to view non-terminal errors in your test dataset. 

1. In the **Labels** section of the dataset gallery, choose **Errors**. The dataset gallery is filtered to only show images with errors.

1. Choose **Error** underneath an image to see the error code. Use the information at [Non-Terminal JSON Line Validation Errors](tm-debugging-json-line-errors.md) to fix the error.  
![\[Error dialog showing "ERROR_UNSUPPORTED_USE_CASE_TYPE" and "ERROR_NO_VALID_LABEL_ATTRIBUTES" under "Dataset record errors".\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/dataset-non-terminal-error.jpg)
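To get the same error information with the SDK, a sketch of the `ListDatasetEntries` route might look like the following. The `HasErrors` filter restricts the results to entries with errors; the commented calls assume credentials and a dataset ARN.

```python
import json
# import boto3  # uncomment to call the service

def parse_entries(dataset_entries):
    """Each dataset entry is a JSON Line string; parse them into dicts."""
    return [json.loads(line) for line in dataset_entries]

# Sketch of the service call, filtering to entries with errors:
# client = boto3.client("rekognition")
# paginator = client.get_paginator("list_dataset_entries")
# entries = []
# for page in paginator.paginate(DatasetArn=dataset_arn, HasErrors=True):
#     entries.extend(page["DatasetEntries"])
# records = parse_entries(entries)
```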

# Training an Amazon Rekognition Custom Labels model
Training a model

You can train a model by using the Amazon Rekognition Custom Labels console, or by using the Amazon Rekognition Custom Labels API. If model training fails, use the information in [Debugging a failed model training](tm-debugging.md) to find the cause of the failure.

**Note**  
You are charged for the amount of time that it takes to successfully train a model. Typically training takes from 30 minutes to 24 hours to complete. For more information, see [Training hours](https://aws.amazon.com/rekognition/pricing/#Amazon_Rekognition_Custom_Labels_pricing). 

A new version of a model is created every time the model is trained. Amazon Rekognition Custom Labels creates a name for the model that is a combination of the project name and the timestamp for when the model is created. 

To train your model, Amazon Rekognition Custom Labels makes a copy of your source training and test images. By default the copied images are encrypted at rest with a key that AWS owns and manages. You can also choose to use your own AWS KMS key. If you use your own KMS key, you need the following permissions on the KMS key.
+ kms:CreateGrant
+ kms:DescribeKey

For more information, see [AWS Key Management Service concepts](https://docs.aws.amazon.com/kms/latest/developerguide/concepts.html#master_keys). Your source images are unaffected.

You can use KMS server-side encryption (SSE-KMS) to encrypt the training and test images in your Amazon S3 bucket, before they are copied by Amazon Rekognition Custom Labels. To allow Amazon Rekognition Custom Labels access to your images, your AWS account needs the following permissions on the KMS key.
+ kms:GenerateDataKey
+ kms:Decrypt

For more information, see [Protecting Data Using Server-Side Encryption with KMS keys Stored in AWS Key Management Service (SSE-KMS)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingKMSEncryption.html).

After training a model, you can evaluate its performance and make improvements. For more information, see [Improving a trained Amazon Rekognition Custom Labels model](improving-model.md).

For other model tasks, such as tagging a model, see [Managing an Amazon Rekognition Custom Labels model](managing-model.md).

**Topics**
+ [

## Training a model (Console)
](#tm-console)
+ [

## Training a model (SDK)
](#tm-sdk)

## Training a model (Console)


You can use the Amazon Rekognition Custom Labels console to train a model.

Training requires a project with a training dataset and a test dataset. If your project doesn't have a test dataset, the Amazon Rekognition Custom Labels console splits the training dataset during training to create one for your project. The images chosen are a representative sampling and aren't used in the training dataset. We recommend splitting your training dataset only if you don't have an alternative test dataset that you can use. Splitting a training dataset reduces the number of images available for training.

**Note**  
You are charged for the amount of time that it takes to train a model. For more information, see [Training hours](https://aws.amazon.com/rekognition/pricing/#Amazon_Rekognition_Custom_Labels_pricing). 

**To train your model (console)**

1. Open the Amazon Rekognition console at [https://console.aws.amazon.com/rekognition/](https://console.aws.amazon.com/rekognition/).

1. Choose **Use Custom Labels**.

1. In the left navigation pane, choose **Projects**.

1. In the **Projects** page, choose the project that contains the model that you want to train. 

1. On the **Project** page, choose **Train model**.  
![\["Train mode" button for training a machine learning model on the dataset in the current project.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/tutorial-train-model.jpg)

1. (Optional) If you want to use your own AWS KMS encryption key, do the following:

   1. In **Image data encryption** choose **Customize encryption settings (advanced)**.

   1. In **encryption.aws\_kms\_key**, enter the Amazon Resource Name (ARN) of your key, or choose an existing AWS KMS key. To create a new key, choose **Create an AWS KMS key**.

1. (Optional) If you want to add tags to your model, do the following:

   1. In the **Tags** section, choose **Add new tag**.

   1. Enter the following:

      1. The name of the key in **Key**.

      1. The value of the key in **Value**.

   1. To add more tags, repeat steps 6a and 6b.

   1. (Optional) If you want to remove a tag, choose **Remove** next to the tag that you want to remove. If you are removing a previously saved tag, it is removed when you save your changes.

1. On the **Train model** page, choose **Train model**. The Amazon Resource Name (ARN) for your project should be in the **Choose project** edit box. If it isn't, enter the ARN for your project.  
![\[Train model button to start training an AI model on the Amazon Rekognition Custom Labels service.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/tutorial-train-model-page-train-model.jpg)

1. In the **Do you want to train your model?** dialog box, choose **Train model**.   
![\[Train model configuration page showing Train Model button.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/tutorial-dialog-train-model.jpg)

1. In the **Models** section of the project page, you can check the current status in the **Model status** column while training is in progress. Training a model takes a while to complete.   
![\[Model status showing 'TRAINING_IN_PROGRESS' indicating the model is currently being trained.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/tutorial-training-progress.jpg)

1. After training completes, choose the model name. Training is finished when the model status is **TRAINING\_COMPLETED**. If training fails, read [Debugging a failed model training](tm-debugging.md).  
![\[Interface showing a trained model and status TRAINING_COMPLETED, indicating the model is ready to run.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/get-started-choose-model.jpg)

1. Next step: Evaluate your model. For more information, see [Improving a trained Amazon Rekognition Custom Labels model](improving-model.md).

## Training a model (SDK)


You train a model by calling [CreateProjectVersion](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_CreateProjectVersion). To train a model, the following information is needed:
+ Name – A unique name for the model version.
+ Project ARN – The Amazon Resource Name (ARN) of the project that manages the model.
+ Training results location – The Amazon S3 location where the results are placed. You can use the same location as the console Amazon S3 bucket, or you can choose a different location. We recommend choosing a different location because this allows you to set permissions and avoid potential naming conflicts with training output from using the Amazon Rekognition Custom Labels console.

Training uses the training and test datasets associated with the project. For more information, see [Managing datasets](managing-dataset.md). 

**Note**  
Optionally, you can specify training and test dataset manifest files that are external to a project. If you open the console after training a model with external manifest files, Amazon Rekognition Custom Labels creates the datasets for you by using the last set of manifest files used for training. You can no longer train a model version for the project by specifying external manifest files. For more information, see [CreateProjectVersion](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_CreateProjectVersion). 

The response from `CreateProjectVersion` is an ARN that you use to identify the model version in subsequent requests. You can also use the ARN to secure the model version. For more information, see [Securing Amazon Rekognition Custom Labels projects](sc-introduction.md#sc-resources).

Training a model version takes a while to complete. The Python and Java examples in this topic use waiters to wait for training to complete. A waiter is a utility method that polls for a particular state to occur. Alternatively, you can get the current status of training by calling `DescribeProjectVersions`. Training is completed when the `Status` field value is `TRAINING_COMPLETED`. After training is completed, you can evaluate the model's quality by reviewing the evaluation results. 
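A minimal sketch of the wait-and-check flow described above (the waiter name follows the AWS SDK for Python; the project ARN and version name are placeholders):

```python
def training_status(describe_response):
    """Extract the status of the first model version from a DescribeProjectVersions response."""
    return describe_response["ProjectVersionDescriptions"][0]["Status"]

# Sketch of the wait-and-check flow (assumes credentials):
# import boto3
# client = boto3.client("rekognition")
# client.get_waiter("project_version_training_completed").wait(
#     ProjectArn=project_arn, VersionNames=[version_name])
# response = client.describe_project_versions(
#     ProjectArn=project_arn, VersionNames=[version_name])
# print(training_status(response))  # e.g. TRAINING_COMPLETED
```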

### Training a model (SDK)


The following example shows how to train a model by using the training and test datasets associated with a project.

**To train a model (SDK)**

1. If you haven't already done so, install and configure the AWS CLI and the AWS SDKs. For more information, see [Step 4: Set up the AWS CLI and AWS SDKs](su-awscli-sdk.md).

1. Use the following example code to train a project.

------
#### [ AWS CLI ]

   The following example creates a model. The training dataset is split to create the testing dataset. Replace the following:
   + `project_arn` with the Amazon Resource Name (ARN) of the project.
   + `version_name` with a unique version name of your choosing.
   + `output_bucket` with the name of the Amazon S3 bucket where Amazon Rekognition Custom Labels saves the training results.
   + `output_folder` with the name of the folder where the training results are saved.
   + (Optional) `--kms-key-id` with the identifier for your AWS Key Management Service customer managed key.

   ```
   aws rekognition create-project-version \
     --project-arn project_arn \
     --version-name version_name \
     --output-config '{"S3Bucket":"output_bucket", "S3KeyPrefix":"output_folder"}' \
     --profile custom-labels-access
   ```

------
#### [ Python ]

   The following example creates a model. Supply the following command line arguments:
   + `project_arn` – The Amazon Resource Name (ARN) of the project.
   + `version_name` – A unique version name for the model of your choosing.
   + `output_bucket` – The name of the Amazon S3 bucket where Amazon Rekognition Custom Labels saves the training results.
   + `output_folder` – The name of the folder where the training results are saved.

   Optionally, supply the following command line parameters to attach a tag to your model:
   + `tag_name` – A tag name of your choosing that you want to attach to the model.
   + `tag_value` – The tag value. 

   ```
   #Copyright 2023 Amazon.com, Inc. or its affiliates. All Rights Reserved.
   #SPDX-License-Identifier: MIT-0 (For details, see https://github.com/awsdocs/amazon-rekognition-custom-labels-developer-guide/blob/master/LICENSE-SAMPLECODE.)
   
   
   import argparse
   import logging
   import json
   import boto3
   
   from botocore.exceptions import ClientError
   
   logger = logging.getLogger(__name__)
   
   def train_model(rek_client, project_arn, version_name, output_bucket, output_folder, tag_key, tag_key_value):
       """
       Trains an Amazon Rekognition Custom Labels model.
       :param rek_client: The Amazon Rekognition Custom Labels Boto3 client.
       :param project_arn: The ARN of the project in which you want to train a model.
       :param version_name: A version for the model.
       :param output_bucket: The S3 bucket that hosts training output.
       :param output_folder: The path for the training output within output_bucket
       :param tag_key: The name of a tag to attach to the model. Pass None to exclude
       :param tag_key_value: The value of the tag. Pass None to exclude
   
       """
   
       try:
           #Train the model
   
           status=""
           logger.info("training model version %s for project %s",
               version_name, project_arn)
   
   
           output_config = {"S3Bucket": output_bucket, "S3KeyPrefix": output_folder}

           tags = {}

           if tag_key is not None and tag_key_value is not None:
               tags = {tag_key: tag_key_value}
   
   
           response=rek_client.create_project_version(
               ProjectArn=project_arn, 
               VersionName=version_name,
               OutputConfig=output_config,
               Tags=tags
           )
   
           logger.info("Started training: %s", response['ProjectVersionArn'])
   
           # Wait for the project version training to complete.
   
           project_version_training_completed_waiter = rek_client.get_waiter('project_version_training_completed')
           project_version_training_completed_waiter.wait(ProjectArn=project_arn,
           VersionNames=[version_name])
       
   
           # Get the completion status.
           describe_response=rek_client.describe_project_versions(ProjectArn=project_arn,
               VersionNames=[version_name])
           for model in describe_response['ProjectVersionDescriptions']:
               logger.info("Status: %s", model['Status'])
               logger.info("Message: %s", model['StatusMessage'])
               status=model['Status']
   
   
           logger.info("finished training")
   
           return response['ProjectVersionArn'], status
       
       except ClientError as err:
           logger.exception("Couldn't create model: %s", err.response['Error']['Message'] )
           raise
   
   def add_arguments(parser):
       """
       Adds command line arguments to the parser.
       :param parser: The command line parser.
       """
   
       parser.add_argument(
           "project_arn", help="The ARN of the project in which you want to train a model"
       )
   
       parser.add_argument(
           "version_name", help="A version name of your choosing."
       )
   
       parser.add_argument(
           "output_bucket", help="The S3 bucket that receives the training results."
       )
   
       parser.add_argument(
           "output_folder", help="The folder in the S3 bucket where training results are stored."
       )
   
       parser.add_argument(
           "--tag_name",  help="The name of a tag to attach to the model", required=False
       )
   
       parser.add_argument(
           "--tag_value",  help="The value for the tag.", required=False
       )
   
   
   
   
   def main():
   
       logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")
   
       try:
   
           # Get command line arguments.
           parser = argparse.ArgumentParser(usage=argparse.SUPPRESS)
           add_arguments(parser)
           args = parser.parse_args()
   
           print(f"Training model version {args.version_name} for project {args.project_arn}")
   
           # Train the model.
           session = boto3.Session(profile_name='custom-labels-access')
           rekognition_client = session.client("rekognition")
   
           model_arn, status=train_model(rekognition_client, 
               args.project_arn,
               args.version_name,
               args.output_bucket,
               args.output_folder,
               args.tag_name,
               args.tag_value)
   
   
           print(f"Finished training model: {model_arn}")
           print(f"Status: {status}")
   
   
       except ClientError as err:
           logger.exception("Problem training model: %s", err)
           print(f"Problem training model: {err}")
       except Exception as err:
           logger.exception("Problem training model: %s", err)
           print(f"Problem training model: {err}")
   
   
   if __name__ == "__main__":
       main()
   ```

------
#### [ Java V2 ]

   The following example trains a model. Supply the following command line arguments:
   + `project_arn` – The Amazon Resource Name (ARN) of the project.
   + `version_name` – A unique version name for the model of your choosing.
   + `output_bucket` – The name of the Amazon S3 bucket where Amazon Rekognition Custom Labels saves the training results.
   + `output_folder` – The name of the folder where the training results are saved.

   ```
   /*
      Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
      SPDX-License-Identifier: Apache-2.0
   */
   package com.example.rekognition;
   
   import software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider;
   import software.amazon.awssdk.core.waiters.WaiterResponse;
   import software.amazon.awssdk.regions.Region;
   import software.amazon.awssdk.services.rekognition.RekognitionClient;
   import software.amazon.awssdk.services.rekognition.model.CreateProjectVersionRequest;
   import software.amazon.awssdk.services.rekognition.model.CreateProjectVersionResponse;
   import software.amazon.awssdk.services.rekognition.model.DescribeProjectVersionsRequest;
   import software.amazon.awssdk.services.rekognition.model.DescribeProjectVersionsResponse;
   import software.amazon.awssdk.services.rekognition.model.OutputConfig;
   import software.amazon.awssdk.services.rekognition.model.ProjectVersionDescription;
   import software.amazon.awssdk.services.rekognition.model.RekognitionException;
   import software.amazon.awssdk.services.rekognition.waiters.RekognitionWaiter;
   
   import java.util.Optional;
   import java.util.logging.Level;
   import java.util.logging.Logger;
   
   public class TrainModel {
   
       public static final Logger logger = Logger.getLogger(TrainModel.class.getName());
   
       public static String trainMyModel(RekognitionClient rekClient, String projectArn, String versionName,
               String outputBucket, String outputFolder) {
   
           try {
   
               OutputConfig outputConfig = OutputConfig.builder().s3Bucket(outputBucket).s3KeyPrefix(outputFolder).build();
   
               logger.log(Level.INFO, "Training Model for project {0}", projectArn);
               CreateProjectVersionRequest createProjectVersionRequest = CreateProjectVersionRequest.builder()
                       .projectArn(projectArn).versionName(versionName).outputConfig(outputConfig).build();
   
               CreateProjectVersionResponse response = rekClient.createProjectVersion(createProjectVersionRequest);
   
               logger.log(Level.INFO, "Model ARN: {0}", response.projectVersionArn());
               logger.log(Level.INFO, "Training model...");
   
               // wait until training completes
   
               DescribeProjectVersionsRequest describeProjectVersionsRequest = DescribeProjectVersionsRequest.builder()
                       .versionNames(versionName)
                       .projectArn(projectArn)
                       .build();
   
               RekognitionWaiter waiter = rekClient.waiter();
   
               WaiterResponse<DescribeProjectVersionsResponse> waiterResponse = waiter
                       .waitUntilProjectVersionTrainingCompleted(describeProjectVersionsRequest);
   
               Optional<DescribeProjectVersionsResponse> optionalResponse = waiterResponse.matched().response();
   
               DescribeProjectVersionsResponse describeProjectVersionsResponse = optionalResponse.get();
   
               for (ProjectVersionDescription projectVersionDescription : describeProjectVersionsResponse
                       .projectVersionDescriptions()) {
                   System.out.println("ARN: " + projectVersionDescription.projectVersionArn());
                   System.out.println("Status: " + projectVersionDescription.statusAsString());
                   System.out.println("Message: " + projectVersionDescription.statusMessage());
               }
   
               return response.projectVersionArn();
   
           } catch (RekognitionException e) {
               logger.log(Level.SEVERE, "Could not train model: {0}", e.getMessage());
               throw e;
           }
   
       }
   
       public static void main(String args[]) {
   
           String versionName = null;
           String projectArn = null;
           String projectVersionArn = null;
           String bucket = null;
           String location = null;
   
           final String USAGE = "\n" + "Usage: " + "<project_arn> <version_name> <output_bucket> <output_folder>\n\n" + "Where:\n"
                   + "   project_arn - The ARN of the project that you want to use. \n\n"
                   + "   version_name - A version name for the model.\n\n"
                   + "   output_bucket - The S3 bucket in which to place the training output. \n\n"
                   + "   output_folder - The folder within the bucket that the training output is stored in. \n\n";
   
           if (args.length != 4) {
               System.out.println(USAGE);
               System.exit(1);
           }
   
           projectArn = args[0];
           versionName = args[1];
           bucket = args[2];
           location = args[3];
   
           try {
   
               // Get the Rekognition client.
               RekognitionClient rekClient = RekognitionClient.builder()
               .credentialsProvider(ProfileCredentialsProvider.create("custom-labels-access"))
               .region(Region.US_WEST_2)
               .build();
   
   
               // Train model
               projectVersionArn = trainMyModel(rekClient, projectArn, versionName, bucket, location);
   
               System.out.println(String.format("Created model: %s for Project ARN: %s", projectVersionArn, projectArn));
   
               rekClient.close();
   
           } catch (RekognitionException rekError) {
               logger.log(Level.SEVERE, "Rekognition client error: {0}", rekError.getMessage());
               System.exit(1);
           }
   
       }
   
   }
   ```

------

1. If training fails, read [Debugging a failed model training](tm-debugging.md). 

# Debugging a failed model training
Debugging model training

You might encounter errors during model training. Amazon Rekognition Custom Labels reports training errors in the console and in the response from [DescribeProjectVersions](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_DescribeProjectVersions).

Errors are either terminal (training can't continue), or they are non-terminal (training can continue). For errors that relate to the contents of the training and testing datasets, you can download the validation results (a [manifest summary](tm-debugging-summary.md) and [training and testing validation manifests](tm-debugging-scope-json-line.md)). Use the error codes in the validation results to find further information in this section. This section also provides information for manifest file errors (terminal errors that happen before the manifest file contents are validated). 

**Note**  
A manifest is the file used to store the contents of a dataset.

You can fix some errors by using the Amazon Rekognition Custom Labels console. Other errors might require you to make updates to the training or testing manifest files. You might need to make other changes, such as updating IAM permissions. For more information, see the documentation for individual errors.

## Terminal errors


Terminal errors stop the training of a model. There are three categories of terminal training errors: service errors, manifest file errors, and manifest content errors. 

In the console, Amazon Rekognition Custom Labels shows terminal errors for a model in the **Status message** column of the projects page.

![\[A screenshot of the Project management dashboard.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/terminal-errors.png)


If you're using the AWS SDK, you can find out if a terminal manifest file error or a terminal manifest content error has occurred by checking the response from [DescribeProjectVersions](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_DescribeProjectVersions). In this case, the `Status` value is `TRAINING_FAILED` and the `StatusMessage` field contains the error. 
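As a brief sketch (the helper name is illustrative, and the response shape follows `DescribeProjectVersions`), you can collect the failure messages like this:

```python
def report_training_failures(describe_response):
    """Return the StatusMessage for each model version whose training failed.

    describe_response follows the shape of a DescribeProjectVersions
    response; the function name is illustrative, not part of the SDK.
    """
    return [
        description.get("StatusMessage", "(no status message)")
        for description in describe_response.get("ProjectVersionDescriptions", [])
        if description.get("Status") == "TRAINING_FAILED"
    ]
```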

### Service errors


Terminal service errors occur when Amazon Rekognition experiences a service issue and can't continue training, for example, when another service that Amazon Rekognition Custom Labels depends on fails. Amazon Rekognition Custom Labels reports service errors in the console as *Amazon Rekognition experienced a service issue*. If you use the AWS SDK, service errors that occur during training are raised as an `InternalServerError` exception by [CreateProjectVersion](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_CreateProjectVersion) and [DescribeProjectVersions](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_DescribeProjectVersions).

If a service error occurs, retry training of the model. If training continues to fail, contact *[AWS Support](https://aws.amazon.com/premiumsupport/)* and include any error information reported with the service error. 

### List of terminal manifest file errors


Manifest file errors are terminal errors in the training and testing datasets that happen at the file level, or across multiple files. Manifest file errors are detected before the contents of the training and testing datasets are validated. Manifest file errors prevent the reporting of [non-terminal validation errors](#tm-error-category-non-terminal-errors). For example, an empty training manifest file generates the error *The manifest file is empty*. Because the file is empty, no non-terminal JSON Line validation errors can be reported, and the manifest summary isn't created. 

You must fix manifest file errors before you can train your model. 

The following lists the manifest file errors.
+ [The manifest file extension or contents are invalid.](tm-terminal-errors-reference.md#tm-error-message-ERROR_MANIFEST_INACCESSIBLE_OR_UNSUPPORTED_FORMAT)
+ [The manifest file is empty.](tm-terminal-errors-reference.md#tm-error-message-ERROR_EMPTY_MANIFEST)
+ [The manifest file size exceeds the maximum supported size.](tm-terminal-errors-reference.md#tm-error-message-ERROR_MANIFEST_SIZE_TOO_LARGE)
+ [Unable to write to output S3 bucket.](tm-terminal-errors-reference.md#tm-error-message-ERROR_CANNOT_WRITE_OUTPUT_S3_BUCKET)
+ [The S3 bucket permissions are incorrect.](tm-terminal-errors-reference.md#tm-error-message-ERROR_INVALID_PERMISSIONS_MANIFEST_S3_BUCKET)

### List of terminal manifest content errors


Manifest content errors are terminal errors that relate to the content within a manifest. For example, if you get the error [The manifest file contains insufficient labeled images per label to perform auto-split](tm-debugging-aggregate-errors.md#tm-error-message-ERROR_INSUFFICIENT_IMAGES_PER_LABEL_FOR_AUTOSPLIT), training can't finish as there aren't enough labeled images in the training dataset to create a testing dataset. 

As well as being reported in the console and in the response from `DescribeProjectVersions`, the error is reported in the manifest summary along with any other terminal manifest content errors. For more information, see [Understanding the manifest summary](tm-debugging-summary.md).

Non-terminal JSON Line errors are also reported in separate training and testing validation results manifests. The non-terminal JSON Line errors found by Amazon Rekognition Custom Labels are not necessarily related to the manifest content error(s) that stop training. For more information, see [Understanding training and testing validation result manifests](tm-debugging-scope-json-line.md). 

You must fix manifest content errors before you can train your model. 

The following are the error messages for manifest content errors. 
+ [The manifest file contains too many invalid rows.](tm-debugging-aggregate-errors.md#tm-error-message-ERROR_TOO_MANY_INVALID_ROWS_IN_MANIFEST)
+ [The manifest file contains images from multiple S3 buckets.](tm-debugging-aggregate-errors.md#tm-error-message-ERROR_IMAGES_IN_MULTIPLE_S3_BUCKETS)
+ [Invalid owner id for images S3 bucket.](tm-debugging-aggregate-errors.md#tm-error-message-ERROR_INVALID_IMAGES_S3_BUCKET_OWNER)
+ [The manifest file contains insufficient labeled images per label to perform auto-split.](tm-debugging-aggregate-errors.md#tm-error-message-ERROR_INSUFFICIENT_IMAGES_PER_LABEL_FOR_AUTOSPLIT)
+ [The manifest file has too few labels.](tm-debugging-aggregate-errors.md#tm-error-message-ERROR_MANIFEST_TOO_FEW_LABELS)
+ [The manifest file has too many labels.](tm-debugging-aggregate-errors.md#tm-error-message-ERROR_MANIFEST_TOO_MANY_LABELS)
+ [Less than \$1\$1% label overlap between the training and testing manifest files.](tm-debugging-aggregate-errors.md#tm-error-message-ERROR_INSUFFICIENT_LABEL_OVERLAP)
+ [The manifest file has too few usable labels.](tm-debugging-aggregate-errors.md#tm-error-message-ERROR_MANIFEST_TOO_FEW_USABLE_LABELS)
+ [Less than \$1\$1% usable label overlap between the training and testing manifest files.](tm-debugging-aggregate-errors.md#tm-error-message-ERROR_INSUFFICIENT_USABLE_LABEL_OVERLAP)
+ [Failed to copy images from S3 bucket.](tm-debugging-aggregate-errors.md#tm-error-message-ERROR_FAILED_IMAGES_S3_COPY)

## List of non-terminal JSON line validation errors


JSON Line validation errors are non-terminal errors that don't require Amazon Rekognition Custom Labels to stop training a model.

JSON Line validation errors are not shown in the console. 

In the training and testing datasets, a JSON Line represents the training or testing information for a single image. Validation errors in a JSON Line, such as an invalid image, are reported in the training and testing validation manifests. Amazon Rekognition Custom Labels completes training using the other, valid, JSON Lines that are in the manifest. For more information, see [Understanding training and testing validation result manifests](tm-debugging-scope-json-line.md). For information about validation rules, see [Validation rules for manifest files](md-create-manifest-file-validation-rules.md).

**Note**  
Training fails if there are too many JSON Line errors.

We recommend that you also fix non-terminal JSON Line errors, as they can potentially cause future errors or impact your model training.

Amazon Rekognition Custom Labels can generate the following non-terminal JSON Line validation errors.
+ [The source-ref key is missing.](tm-debugging-json-line-errors.md#tm-error-ERROR_MISSING_SOURCE_REF)
+ [The format of the source-ref value is invalid. ](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_SOURCE_REF_FORMAT)
+ [No label attributes found.](tm-debugging-json-line-errors.md#tm-error-ERROR_NO_LABEL_ATTRIBUTES)
+ [The format of the label attribute \$1\$1 is invalid.](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_LABEL_ATTRIBUTE_FORMAT)
+ [The format of the label attribute metadata is invalid.](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_LABEL_ATTRIBUTE_METADATA_FORMAT)
+ [No valid label attributes found.](tm-debugging-json-line-errors.md#tm-error-ERROR_NO_VALID_LABEL_ATTRIBUTES)
+ [One or more bounding boxes has a missing confidence value.](tm-debugging-json-line-errors.md#tm-error-ERROR_MISSING_BOUNDING_BOX_CONFIDENCE)
+ [One or more class ids is missing from the class map.](tm-debugging-json-line-errors.md#tm-error-ERROR_MISSING_CLASS_MAP_ID)
+ [The JSON Line has an invalid format.](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_JSON_LINE)
+ [The image is invalid. Check S3 path and/or image properties.](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_IMAGE)
+ [The bounding box has off frame values.](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_BOUNDING_BOX)
+ [The height and width of the bounding box is too small.](tm-debugging-json-line-errors.md#tm-error-ERROR_BOUNDING_BOX_TOO_SMALL)
+ [There are more bounding boxes than the allowed maximum.](tm-debugging-json-line-errors.md#tm-error-ERROR_TOO_MANY_BOUNDING_BOXES)
+ [No valid annotations found.](tm-debugging-json-line-errors.md#tm-error-ERROR_NO_VALID_ANNOTATIONS)

# Understanding the manifest summary


The manifest summary contains the following information.
+ Error information about [terminal manifest content errors](tm-debugging.md#tm-error-category-combined-terminal) encountered during validation. 
+ Error location information for [non-terminal JSON Line validation errors](tm-debugging.md#tm-error-category-non-terminal-errors) in the training and testing datasets.
+ Error statistics such as the total number of invalid JSON Lines found in the training and testing datasets. 

The manifest summary is created during training if there are no [terminal manifest file errors](tm-debugging.md#tm-error-category-terminal). To get the location of the manifest summary file (*manifest\_summary.json*), see [Getting the validation results](tm-debugging-getting-validation-data.md).

**Note**  
[Service errors](tm-debugging.md#tm-error-category-service) and [manifest file errors](tm-debugging.md#tm-error-category-terminal) are not reported in the manifest summary. For more information, see [Terminal errors](tm-debugging.md#tm-error-categories-terminal). 

For information about specific manifest content errors, see [Terminal manifest content errors](tm-debugging-aggregate-errors.md).

## Manifest summary file format


A manifest summary file has two sections, `statistics` and `errors`.

### statistics


`statistics` contains information about the errors in the training and testing datasets.
+ `training` – statistics and errors found in the training dataset. 
+ `testing` – statistics and errors found in the testing dataset.



Objects in the `errors` array contain the error code and message for manifest content errors.

The `error_line_indices` array contains the line numbers for each JSON Line in the training or test manifest that has an error. For more information, see [Fixing training errors](tm-debugging-fixing-validation-errors.md). 
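As a hedged sketch of how you might use those line numbers, the helper below (an illustrative name, assuming the indices are 1-based line numbers — check them against your own manifest) pulls the matching JSON Lines out of a local copy of a training or testing manifest:

```python
def lines_with_errors(manifest_path, error_line_indices):
    """Return {line_number: raw JSON Line} for the given line numbers from a
    local copy of a training or testing manifest.

    Assumes the indices are 1-based line numbers; the helper name is
    illustrative, not part of the service.
    """
    wanted = set(error_line_indices)
    with open(manifest_path) as manifest:
        return {
            number: line.rstrip("\n")
            for number, line in enumerate(manifest, start=1)
            if number in wanted
        }
```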

### errors


Errors that span both the training and testing datasets. For example, an [ERROR\_INSUFFICIENT\_USABLE\_LABEL\_OVERLAP](tm-debugging-aggregate-errors.md#tm-error-ERROR_INSUFFICIENT_USABLE_LABEL_OVERLAP) error occurs when there aren't enough usable labels that overlap between the training and testing datasets.

```
{
    "statistics": {
        "training": 
            {
                "use_case": String, # Possible values are IMAGE_LEVEL_LABELS, OBJECT_LOCALIZATION and NOT_DETERMINED
                "total_json_lines": Number,   # Total number of JSON Lines (images) in the training manifest.
                "valid_json_lines": Number,   # Total number of JSON Lines (images) that can be used for training.
                "invalid_json_lines": Number, # Total number of invalid JSON Lines. They are not used for training.
                "ignored_json_lines": Number, # JSON Lines that have a valid schema but have no annotations. They aren't used for training and aren't counted as invalid.
                "error_json_line_indices": List[int], # Contains a list of line numbers for JSON line errors in the training dataset.
                "errors": [
                    {
                        "code": String, # Error code for a training manifest content error.
                        "message": String # Description for a training manifest content error.
                    }
                ]
            },
        "testing": 
            {
                "use_case": String, # Possible values are IMAGE_LEVEL_LABELS, OBJECT_LOCALIZATION and NOT_DETERMINED
                "total_json_lines": Number, # Total number of JSON Lines (images) in the manifest.
                "valid_json_lines": Number,  # Total number of JSON Lines (images) that can be used for testing.
                "invalid_json_lines": Number, # Total number of invalid JSON Lines. They are not used for testing.
                "ignored_json_lines": Number, # JSON Lines that have a valid schema but have no annotations. They aren't used for testing and aren't counted as invalid.
                "error_json_line_indices": List[int], # contains a list of error record line numbers in testing dataset.
                "errors": [
                    {
                        "code": String,   # Error code for a testing manifest content error.
                        "message": String # Description for a testing manifest content error.
                    }
                ]  
            }
    },
    "errors": [
        {
            "code": String, # Error code for errors that span the training and testing datasets.
            "message": String # Description of the error.
        }
    ]
}
```

## Example manifest summary


The following example is a partial manifest summary that shows a terminal manifest content error ([ERROR\_TOO\_MANY\_INVALID\_ROWS\_IN\_MANIFEST](tm-debugging-aggregate-errors.md#tm-error-ERROR_TOO_MANY_INVALID_ROWS_IN_MANIFEST)). The `error_json_line_indices` array contains the line numbers of non-terminal JSON Line errors in the corresponding training or testing validation manifest.

```
{
    "errors": [],
    "statistics": {
        "training": {
            "use_case": "NOT_DETERMINED",
            "total_json_lines": 301,
            "valid_json_lines": 146,
            "invalid_json_lines": 155,
            "ignored_json_lines": 0,
            "errors": [
                {
                    "code": "ERROR_TOO_MANY_INVALID_ROWS_IN_MANIFEST",
                    "message": "The manifest file contains too many invalid rows."
                }
            ],
            "error_json_line_indices": [ 
                15,
                16,
                17,
                22,
                23,
                24,
                 .
                 .
                 .
                 .                 
                300
            ]
        },
        "testing": {
            "use_case": "NOT_DETERMINED",
            "total_json_lines": 15,
            "valid_json_lines": 13,
            "invalid_json_lines": 2,
            "ignored_json_lines": 0,
            "errors": [],
            "error_json_line_indices": [ 
                13,
                15
            ]
        }
    }
}
```
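A minimal sketch for inspecting a downloaded manifest summary follows. The helper name is illustrative and the structure it assumes is the manifest summary format shown earlier; pass it the result of `json.load()` on the summary file.

```python
def summarize_errors(summary):
    """Tally invalid/ignored JSON Lines and error codes from a parsed
    manifest summary (the structure shown in the format section above).
    The helper name is illustrative."""
    report = {}
    for dataset in ("training", "testing"):
        stats = summary["statistics"][dataset]
        report[dataset] = {
            "invalid_json_lines": stats["invalid_json_lines"],
            "ignored_json_lines": stats["ignored_json_lines"],
            "error_codes": [error["code"] for error in stats["errors"]],
            "error_lines": stats.get("error_json_line_indices", []),
        }
    # Errors that span both datasets sit in the top-level errors array.
    report["cross_dataset_error_codes"] = [error["code"] for error in summary["errors"]]
    return report
```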

# Understanding training and testing validation result manifests


During training, Amazon Rekognition Custom Labels creates validation result manifests to hold non-terminal JSON Line errors. The validation results manifests are copies of the training and testing datasets with error information added. You can access the validation manifests after training completes. For more information, see [Getting the validation results](tm-debugging-getting-validation-data.md). Amazon Rekognition Custom Labels also creates a manifest summary that includes overview information for JSON Line errors, such as error locations and JSON Line error counts. For more information, see [Understanding the manifest summary](tm-debugging-summary.md).

**Note**  
Validation results (Training and Testing Validation Result Manifests and Manifest Summary) are only created if there are no [List of terminal manifest file errors](tm-debugging.md#tm-error-category-terminal).

A manifest contains JSON Lines for each image in the dataset. Within the validation results manifests, JSON Line error information is added to the JSON Lines where errors occur.

A JSON Line error is a non-terminal error related to a single image. A non-terminal validation error can invalidate the entire JSON Line or just a portion. For example, if the image referenced in a JSON Line is not in PNG or JPG format, an [ERROR\_INVALID\_IMAGE](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_IMAGE) error occurs and the entire JSON Line is excluded from training. Training continues with other valid JSON Lines.

Within a JSON Line, an error might mean the JSON Line can still be used for training. For example, if the left value for one of four bounding boxes associated with a label is negative, the model is still trained using the other valid bounding boxes. JSON Line error information is returned for the invalid bounding box ([ERROR\_INVALID\_BOUNDING\_BOX](tm-debugging-json-line-errors.md#tm-error-ERROR_INVALID_BOUNDING_BOX)). In this example, the error information is added to the `annotation` object where the error occurs. 

Warning errors, such as [WARNING\_NO\_ANNOTATIONS](tm-debugging-json-line-errors.md#tm-warning-WARNING_NO_ANNOTATIONS), aren't used for training and count as ignored JSON Lines (`ignored_json_lines`) in the manifest summary. For more information, see [Understanding the manifest summary](tm-debugging-summary.md). Additionally, ignored JSON Lines don't count towards the 20% error threshold for training and testing.

 For information about specific non-terminal data validation errors, see [Non-Terminal JSON Line Validation Errors](tm-debugging-json-line-errors.md). 

**Note**  
If there are too many data validation errors, training is stopped and an [ERROR\_TOO\_MANY\_INVALID\_ROWS\_IN\_MANIFEST](tm-debugging-aggregate-errors.md#tm-error-ERROR_TOO_MANY_INVALID_ROWS_IN_MANIFEST) terminal error is reported in the manifest summary.

For information about correcting JSON Line errors, see [Fixing training errors](tm-debugging-fixing-validation-errors.md). 



## JSON line error format


Amazon Rekognition Custom Labels adds non-terminal validation error information to image level and object localization format JSON Lines. For more information, see [Creating a manifest file](md-create-manifest-file.md).

### Image level errors


The following example shows the `errors` arrays in an image-level JSON Line. There are two sets of errors: errors related to the label attribute metadata (in this example, `sport-metadata`) and errors related to the image. An error includes an error code (`code`) and an error message (`message`). For more information, see [Importing image-level labels in manifest files](md-create-manifest-file-classification.md). 

```
{
    "source-ref": String,
    "sport": Number,
    "sport-metadata": {
        "class-name": String,
        "confidence": Float,
        "type": String,
        "job-name": String,
        "human-annotated": String,
        "creation-date": String,
        "errors": [
            {
                "code": String, # error codes for label
                "message": String # Description and additional contextual details of the error
            }
        ] 
    },
    "errors": [
        {
            "code": String, # error codes for image
            "message": String # Description and additional contextual details of the error
        }
    ]
}
```

### Object localization errors


The following example shows the `errors` arrays in an object localization JSON Line. The JSON Line contains an `errors` array for each of the following sections. Each error object includes the error code and the error message.
+ *label attribute* – Errors for the label attribute fields. See `bounding-box` in the example. 
+ *annotations* – Annotation errors (bounding boxes) are stored in the `annotations` array inside the label attribute.
+ *label attribute-metadata* – Errors for the label attribute metadata. See `bounding-box-metadata` in the example.
+ *image* – Errors not related to the label attribute, annotation, or label attribute metadata fields. 

For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md). 

```
{
    "source-ref": String,
    "bounding-box": {
        "image_size": [
            {
                "width": Int,
                "height": Int,
                "depth":Int,
            }
        ],
        "annotations": [
            {
                "class_id": Int,
                "left": Int,
                "top": Int,
                "width": Int,
                "height": Int,
                "errors": [   # annotation field errors
                    {
                        "code": String, # annotation field error code
                        "message": String # Description and additional contextual details of the error
                    }
                ]
            }
        ],
        "errors": [ #label attribute field errors
            {
                "code": String, # error code
                "message": String # Description and additional contextual details of the error
            }
        ] 
    },
    "bounding-box-metadata": {
        "objects": [
            {
                "confidence": Float
            }
        ],
        "class-map": {
            String: String
        }, 
        "type": String,
        "human-annotated": String,
        "creation-date": String,
        "job-name": String,
        "errors": [  #metadata field errors
            {
                "code": String, # error code
                "message": String # Description and additional contextual details of the error
            }
        ] 
    },
   "errors": [  # image errors
        {
            "code": String, # error code
            "message": String # Description and additional contextual details of the error
        }
    ] 
 }
```
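Both formats attach errors at predictable places (top-level image errors, label attribute errors, metadata errors, and per-annotation errors), so a short script can walk a validation results manifest and list every non-terminal error it contains. The following is a minimal sketch, assuming the validation manifest has been downloaded locally; the function name is illustrative.

```python
import json


def collect_errors(manifest_path):
    """Collect non-terminal JSON Line errors from a validation results manifest.

    Returns a list of (line_number, field_path, error_code) tuples covering
    image-level errors, label attribute errors, metadata errors, and
    per-annotation (bounding box) errors.
    """
    found = []
    with open(manifest_path) as f:
        for line_no, line in enumerate(f):
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            # Image-level errors live at the top of the JSON Line.
            for err in entry.get("errors", []):
                found.append((line_no, "image", err["code"]))
            for key, value in entry.items():
                if not isinstance(value, dict):
                    continue
                # Label attribute and label attribute metadata errors.
                for err in value.get("errors", []):
                    found.append((line_no, key, err["code"]))
                # Object localization: per-bounding-box errors.
                for i, ann in enumerate(value.get("annotations", [])):
                    for err in ann.get("errors", []):
                        found.append((line_no, f"{key}.annotations[{i}]", err["code"]))
    return found
```

Running this against both validation manifests gives a quick inventory of which JSON Lines need attention and why.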

## Example JSON line error


The following object localization JSON Line (formatted for readability) shows an [ERROR\_BOUNDING\_BOX\_TOO\_SMALL](tm-debugging-json-line-errors.md#tm-error-ERROR_BOUNDING_BOX_TOO_SMALL) error. In this example, the first bounding box is only 1 pixel high and 1 pixel wide, which is smaller than the minimum allowed size.

```
{
    "source-ref": "s3://bucket/Manifests/images/199940-1791.jpg",
    "bounding-box": {
        "image_size": [
            {
                "width": 3000,
                "height": 3000,
                "depth": 3
            }
        ],
        "annotations": [
            {
                "class_id": 1,
                "top": 0,
                "left": 0,
                "width": 1,
                "height": 1, 
                "errors": [
                    {
                        "code": "ERROR_BOUNDING_BOX_TOO_SMALL",
                        "message": "The height and width of the bounding box is too small."
                    }
                ]
            },
            {
                "class_id": 0,
                "top": 65,
                "left": 86,
                "width": 220,
                "height": 334
            }
        ]
    },
    "bounding-box-metadata": {
        "objects": [
            {
                "confidence": 1
            },
            {
                "confidence": 1
            }
        ],
        "class-map": {
            "0": "Echo",
            "1": "Echo Dot"
        },
        "type": "groundtruth/object-detection",
        "human-annotated": "yes",
        "creation-date": "2019-11-20T02:57:28.288286",
        "job-name": "my job"
    }
}
```

# Getting the validation results


The validation results contain error information for [List of terminal manifest content errors](tm-debugging.md#tm-error-category-combined-terminal) and [List of non-terminal JSON line validation errors](tm-debugging.md#tm-error-category-non-terminal-errors). There are three validation results files.
+ *training\_manifest\_with\_validation.json* – A copy of the training dataset manifest file with JSON Line error information added.
+ *testing\_manifest\_with\_validation.json* – A copy of the testing dataset manifest file with JSON Line error information added. 
+ *manifest\_summary.json* – A summary of manifest content errors and JSON Line errors found in the training and testing datasets. For more information, see [Understanding the manifest summary](tm-debugging-summary.md).

For information about the contents of the training and testing validation manifests, see [Debugging a failed model training](tm-debugging.md). 

**Note**  
The validation results are created only if no [List of terminal manifest file errors](tm-debugging.md#tm-error-category-terminal) are generated during training.
If a [service error](tm-debugging.md#tm-error-category-service) occurs after the training and testing manifest are validated, the validation results are created, but the response from [DescribeProjectVersions](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_DescribeProjectVersions) doesn't include the validation results file locations.

After training completes or fails, you can download the validation results by using the Amazon Rekognition Custom Labels console or get the Amazon S3 bucket location by calling [DescribeProjectVersions](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_DescribeProjectVersions) API.

## Getting validation results (Console)


If you are using the console to train your model, you can download the validation results from a project's list of models, as shown in the following screenshot.

![\[Interface showing model training and validation results with option to download validation results.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/models-validation-results.jpg)


You can also download the validation results from a model's details page.

![\[Screenshot of the dataset details panel with status, links to training and test datasets, and download links for manifest items.\]](http://docs.aws.amazon.com/rekognition/latest/customlabels-dg/images/model-validation-results.jpg)


For more information, see [Training a model (Console)](training-model.md#tm-console). 

## Getting validation results (SDK)


After model training completes, Amazon Rekognition Custom Labels stores the validation results in the Amazon S3 bucket specified during training. You can get the S3 bucket location by calling the [DescribeProjectVersions](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_DescribeProjectVersions) API. To train a model, see [Training a model (SDK)](training-model.md#tm-sdk).

A [ValidationData](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_ValidationData) object is returned for the training dataset ([TrainingDataResult](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_TrainingDataResult)) and the testing dataset ([TestingDataResult](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_TestingDataResult)). The manifest summary is returned in `ManifestSummary`.

After you get the Amazon S3 bucket location, you can download the validation results. For more information, see [How do I download an object from an S3 bucket?](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/download-objects.html). You can also use the [GetObject](https://docs.aws.amazon.com/AmazonS3/latest/dev/GettingObjectsUsingAPIs.html) operation.

**To get validation data (SDK)**

1. If you haven't already done so, install and configure the AWS CLI and the AWS SDKs. For more information, see [Step 4: Set up the AWS CLI and AWS SDKs](su-awscli-sdk.md).

1. Use the following example to get the location of the validation results. 

------
#### [ Python ]

   Replace `project_arn` with the Amazon Resource Name (ARN) of the project that contains the model. For more information, see [Managing an Amazon Rekognition Custom Labels project](managing-project.md). Replace `version_name` with the name of the model version. For more information, see [Training a model (SDK)](training-model.md#tm-sdk). 

   ```
   import boto3
   import io
   from io import BytesIO
   import sys
   import json
   
   
   def describe_model(project_arn, version_name):
   
       client=boto3.client('rekognition')
       
       response=client.describe_project_versions(ProjectArn=project_arn,
           VersionNames=[version_name])
   
       for model in response['ProjectVersionDescriptions']:
           print(json.dumps(model,indent=4,default=str))
          
   def main():
   
       project_arn='project_arn'
       version_name='version_name'
   
       describe_model(project_arn, version_name)
   
   if __name__ == "__main__":
       main()
   ```

------

1. In the program output, note the `Validation` field within the `TestingDataResult` and `TrainingDataResult` objects. The manifest summary is in `ManifestSummary`.
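The `Validation` and `ManifestSummary` fields each reference their files through a `GroundTruthManifest`/`S3Object` structure (`Bucket` and `Name`). A sketch like the following could pull those locations out of a `ProjectVersionDescription` and download them; the function names are illustrative, and the download step assumes boto3 is installed and AWS credentials are configured.

```python
import os


def validation_s3_objects(model_description):
    """Yield (bucket, key) pairs for the validation files referenced by a
    ProjectVersionDescription returned by DescribeProjectVersions."""
    for field in ("TrainingDataResult", "TestingDataResult"):
        validation = model_description.get(field, {}).get("Validation", {})
        for asset in validation.get("Assets", []):
            s3_object = asset["GroundTruthManifest"]["S3Object"]
            yield s3_object["Bucket"], s3_object["Name"]
    summary = model_description.get("ManifestSummary", {}).get("S3Object")
    if summary:
        yield summary["Bucket"], summary["Name"]


def download_validation_results(model_description, local_dir="."):
    """Download each validation file; download_file wraps the GetObject operation."""
    import boto3  # imported here so the helper above stays usable without boto3

    s3 = boto3.client("s3")
    for bucket, key in validation_s3_objects(model_description):
        s3.download_file(bucket, key, os.path.join(local_dir, os.path.basename(key)))
```

You would pass one of the `ProjectVersionDescriptions` entries printed by the previous example.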

# Fixing training errors


You use the manifest summary to identify [List of terminal manifest content errors](tm-debugging.md#tm-error-category-combined-terminal) and [List of non-terminal JSON line validation errors](tm-debugging.md#tm-error-category-non-terminal-errors) encountered during training. You must fix manifest content errors. We recommend that you also fix non-terminal JSON Line errors. For information about specific errors, see [Non-Terminal JSON Line Validation Errors](tm-debugging-json-line-errors.md) and [Terminal manifest content errors](tm-debugging-aggregate-errors.md).

You can make fixes to the training or testing dataset used for training. Alternatively, you can make the fixes in the training and testing validation manifest files and use them to train the model. 

After you make your fixes, you need to import the updated manifest(s) and retrain the model. For more information, see [Creating a manifest file](md-create-manifest-file.md).

The following procedure shows you how to use the manifest summary to fix terminal manifest content errors. The procedure also shows you how to locate and fix JSON Line errors in the training and testing validation manifests. 

**To fix Amazon Rekognition Custom Labels training errors**

1. Download the validation results files. The file names are *training\_manifest\_with\_validation.json*, *testing\_manifest\_with\_validation.json*, and *manifest\_summary.json*. For more information, see [Getting the validation results](tm-debugging-getting-validation-data.md). 

1. Open the manifest summary file (*manifest\_summary.json*). 

1. Fix any errors in the manifest summary. For more information, see [Understanding the manifest summary](tm-debugging-summary.md).

1. In the manifest summary, iterate through the `error_line_indices` array in `training` and fix the errors in `training_manifest_with_validation.json` at the corresponding JSON Line numbers. For more information, see [Understanding training and testing validation result manifests](tm-debugging-scope-json-line.md).

1. Iterate through the `error_line_indices` array in `testing` and fix the errors in `testing_manifest_with_validation.json` at the corresponding JSON Line numbers.

1. Retrain the model using the validation manifest files as the training and testing datasets. For more information, see [Training an Amazon Rekognition Custom Labels model](training-model.md). 

If you are using the AWS SDK and choose to fix the errors in the training or the test validation data manifest files, use the location of the validation data manifest files in the [TrainingData](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_TrainingData) and [TestingData](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_TestingData) input parameters to [CreateProjectVersion](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_CreateProjectVersion). For more information, see [Training a model (SDK)](training-model.md#tm-sdk). 
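Steps 4 and 5 of the procedure can be scripted: the `error_line_indices` arrays in the manifest summary give the zero-based positions of the JSON Lines that need fixing. The following sketch assumes the summary organizes its per-split statistics under a `statistics` object, as described in [Understanding the manifest summary](tm-debugging-summary.md); the function name is illustrative.

```python
import json


def flagged_json_lines(summary_path, manifest_path, split):
    """Return the JSON Lines flagged for one split ('training' or 'testing'),
    keyed by their zero-based line number in the validation manifest."""
    with open(summary_path) as f:
        summary = json.load(f)
    # error_line_indices holds the positions of JSON Lines with errors.
    indices = set(summary["statistics"][split]["error_line_indices"])
    flagged = {}
    with open(manifest_path) as f:
        for line_no, line in enumerate(f):
            if line_no in indices:
                flagged[line_no] = json.loads(line)
    return flagged
```

Printing the flagged entries shows exactly which images and fields to correct before retraining.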

## JSON line error precedence


The following JSON Line errors are detected first. If any of these errors occur, validation of JSON Line errors is stopped. You must fix these errors before you can fix any of the other JSON Line errors.
+ MISSING\_SOURCE\_REF
+ ERROR\_INVALID\_SOURCE\_REF\_FORMAT
+ ERROR\_NO\_LABEL\_ATTRIBUTES
+ ERROR\_INVALID\_LABEL\_ATTRIBUTE\_FORMAT
+ ERROR\_INVALID\_LABEL\_ATTRIBUTE\_METADATA\_FORMAT
+ ERROR\_MISSING\_BOUNDING\_BOX\_CONFIDENCE
+ ERROR\_MISSING\_CLASS\_MAP\_ID
+ ERROR\_INVALID\_JSON\_LINE

# Terminal manifest file errors


This topic describes the [List of terminal manifest file errors](tm-debugging.md#tm-error-category-terminal). Manifest file errors do not have an associated error code. The validation results manifests are not created when a terminal manifest file error occurs. For more information, see [Understanding the manifest summary](tm-debugging-summary.md). Terminal manifest errors prevent the reporting of [Non-Terminal JSON Line Validation Errors](tm-debugging-json-line-errors.md). 

## The manifest file extension or contents are invalid.


The training or testing manifest file doesn't have a file extension or its contents are invalid. 

**To fix error *The manifest file extension or contents are invalid.***
+ Check the following possible causes in both the training and testing manifest files.
  + The manifest file is missing a file extension. By convention the file extension is `.manifest`.
  +  The Amazon S3 bucket or key for the manifest file couldn't be found.

## The manifest file is empty.




The training or testing manifest file used for training exists, but it is empty. The manifest file needs a JSON Line for each image that you use for training and testing.

**To fix error *The manifest file is empty.***

1. Check which of the training or testing manifests are empty.

1. Add JSON Lines to the empty manifest file. For more information, see [Creating a manifest file](md-create-manifest-file.md). Alternatively, create a new dataset with the console. For more information, see [Creating training and test datasets with images](md-create-dataset.md).



## The manifest file size exceeds the maximum supported size.




The training or testing manifest file size (in bytes) is too large. For more information, see [Guidelines and quotas in Amazon Rekognition Custom Labels](limits.md). A manifest file can have fewer than the maximum number of JSON Lines and still exceed the maximum file size.

You can't use the Amazon Rekognition Custom Labels console to fix error *The manifest file size exceeds the maximum supported size*.

**To fix error *The manifest file size exceeds the maximum supported size.***

1. Check which of the training and testing manifests exceed the maximum file size.

1. Reduce the number of JSON Lines in the manifest files that are too large. For more information, see [Creating a manifest file](md-create-manifest-file.md).
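One way to bring an over-sized manifest under the limit is to split it into size-bounded parts and train with one part, or re-balance the JSON Lines between parts. The following is a sketch only; the names are illustrative, and a single JSON Line larger than the budget would still produce an oversized part.

```python
def split_manifest(manifest_path, max_bytes, out_prefix):
    """Split a manifest into parts whose sizes stay under max_bytes.
    Returns the list of part file names that were written."""
    parts, buffered, size = [], [], 0

    def flush():
        nonlocal buffered, size
        if buffered:
            name = f"{out_prefix}-{len(parts)}.manifest"
            with open(name, "w") as out:
                out.writelines(buffered)
            parts.append(name)
            buffered, size = [], 0

    with open(manifest_path) as f:
        for line in f:
            encoded = len(line.encode("utf-8"))
            if size + encoded > max_bytes:
                flush()  # start a new part before adding this JSON Line
            buffered.append(line)
            size += encoded
    flush()
    return parts
```

Because each JSON Line is self-contained, splitting on line boundaries keeps every part a valid manifest.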

## The S3 bucket permissions are incorrect.


Amazon Rekognition Custom Labels doesn't have permissions to one or more of the buckets containing the training and testing manifest files. 

You can't use the Amazon Rekognition Custom Labels console to fix this error.

**To fix error *The S3 bucket permissions are incorrect.***
+ Check the permissions for the bucket(s) containing the training and testing manifests. For more information, see [Step 2: Set up Amazon Rekognition Custom Labels console permissions](su-console-policy.md).

## Unable to write to output S3 bucket.




The service is unable to generate the training output files.

**To fix error *Unable to write to output S3 bucket.***
+ Check that the Amazon S3 bucket information in the [OutputConfig](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_OutputConfig) input parameter to [CreateProjectVersion](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_CreateProjectVersion) is correct. 

You can't use the Amazon Rekognition Custom Labels console to fix this error.

# Terminal manifest content errors


This topic describes the [List of terminal manifest content errors](tm-debugging.md#tm-error-category-combined-terminal) reported in the manifest summary. The manifest summary includes an error code and message for each detected error. For more information, see [Understanding the manifest summary](tm-debugging-summary.md). Terminal manifest content errors don't stop the reporting of [List of non-terminal JSON line validation errors](tm-debugging.md#tm-error-category-non-terminal-errors). 

## ERROR\_TOO\_MANY\_INVALID\_ROWS\_IN\_MANIFEST


### Error message


The manifest file contains too many invalid rows. 

### More information


An `ERROR_TOO_MANY_INVALID_ROWS_IN_MANIFEST` error occurs if there are too many JSON Lines that contain invalid content.

You can't use the Amazon Rekognition Custom Labels console to fix an `ERROR_TOO_MANY_INVALID_ROWS_IN_MANIFEST` error.

**To fix ERROR\_TOO\_MANY\_INVALID\_ROWS\_IN\_MANIFEST**

1. Check the manifest for JSON Line errors. For more information, see [Understanding training and testing validation result manifests](tm-debugging-scope-json-line.md).

1.  Fix JSON Lines that have errors. For more information, see [Non-Terminal JSON Line Validation Errors](tm-debugging-json-line-errors.md). 



## ERROR\_IMAGES\_IN\_MULTIPLE\_S3\_BUCKETS


### Error message


The manifest file contains images from multiple S3 buckets.

### More information


A manifest can only reference images stored in a single bucket. Each JSON Line stores the Amazon S3 location of an image in the value of `source-ref`. In the following example, the bucket name is *my-bucket*. 

```
"source-ref": "s3://my-bucket/images/sunrise.png"
```

You can't use the Amazon Rekognition Custom Labels console to fix this error.

**To fix `ERROR_IMAGES_IN_MULTIPLE_S3_BUCKETS`**
+ Ensure that all your images are in the same Amazon S3 bucket and that the value of `source-ref` in every JSON Line references the bucket where your images are stored. Alternatively, choose a preferred Amazon S3 bucket and remove the JSON Lines where `source-ref` doesn't reference your preferred bucket. 
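A quick way to check for this condition is to collect the bucket name from every `source-ref` value in the manifest. The following is a minimal sketch (function name illustrative), assuming the manifest is available locally:

```python
import json
from urllib.parse import urlparse


def buckets_in_manifest(manifest_path):
    """Return the set of Amazon S3 bucket names referenced by source-ref values."""
    buckets = set()
    with open(manifest_path) as f:
        for line in f:
            line = line.strip()
            if line:
                # For s3://my-bucket/images/sunrise.png, netloc is my-bucket.
                buckets.add(urlparse(json.loads(line)["source-ref"]).netloc)
    return buckets
```

If the returned set contains more than one bucket name, the manifest triggers this error.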



## ERROR\_INVALID\_PERMISSIONS\_IMAGES\_S3\_BUCKET


### Error message


The permissions for the images S3 bucket are invalid.

### More information


The permissions on the Amazon S3 bucket that contains the images are incorrect.

You can't use the Amazon Rekognition Custom Labels console to fix this error.

**To fix `ERROR_INVALID_PERMISSIONS_IMAGES_S3_BUCKET`**
+ Check the permissions of the bucket containing the images. The value of the `source-ref` for an image contains the bucket location. 



## ERROR\_INVALID\_IMAGES\_S3\_BUCKET\_OWNER


### Error message


Invalid owner id for images S3 bucket.

### More information


The owner of the bucket that contains the training or test images is different from the owner of the bucket that contains the training or test manifest. You can use the following command to find the owner of a bucket.

```
aws s3api get-bucket-acl --bucket amzn-s3-demo-bucket
```

The `Owner` `ID` must match for the buckets that store the images and manifest files.
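The same comparison can be done with the SDK: `get_bucket_acl` returns the `Owner` `ID` for each bucket, and the IDs must all match. A sketch follows (function names illustrative); the API call requires boto3, AWS credentials, and `s3:GetBucketAcl` permission.

```python
def bucket_owner_ids(bucket_names):
    """Map each bucket name to its Owner ID, mirroring `aws s3api get-bucket-acl`."""
    import boto3  # imported here so owners_match stays usable without boto3

    s3 = boto3.client("s3")
    return {name: s3.get_bucket_acl(Bucket=name)["Owner"]["ID"] for name in bucket_names}


def owners_match(owner_ids):
    """True when every bucket in the mapping has the same Owner ID."""
    return len(set(owner_ids.values())) <= 1
```

For example, `owners_match(bucket_owner_ids(["images-bucket", "manifests-bucket"]))` returns `False` when the buckets have different owners.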

**To fix ERROR\_INVALID\_IMAGES\_S3\_BUCKET\_OWNER**

1. Choose the desired owner of the training, testing, output, and image buckets. The owner must have permissions to use Amazon Rekognition Custom Labels.

1. For each bucket not currently owned by the desired owner, create a new Amazon S3 bucket owned by the preferred owner. 

1. Copy the old bucket contents to the new bucket. For more information, see [How can I copy objects between Amazon S3 buckets?](https://aws.amazon.com/premiumsupport/knowledge-center/move-objects-s3-bucket/).



You can't use the Amazon Rekognition Custom Labels console to fix this error.

## ERROR\_INSUFFICIENT\_IMAGES\_PER\_LABEL\_FOR\_AUTOSPLIT


### Error message


The manifest file contains insufficient labeled images per label to perform auto-split.

### More information


During model training, you can create a testing dataset by using 20% of the images from the training dataset. ERROR\_INSUFFICIENT\_IMAGES\_PER\_LABEL\_FOR\_AUTOSPLIT occurs when there aren't enough images to create an acceptable testing dataset.

You can't use the Amazon Rekognition Custom Labels console to fix this error.

**To fix ERROR\_INSUFFICIENT\_IMAGES\_PER\_LABEL\_FOR\_AUTOSPLIT**
+ Add more labeled images to your training dataset. You can add images in the Amazon Rekognition Custom Labels console by adding images to the training dataset, or by adding JSON Lines to your training manifest. For more information, see [Managing datasets](managing-dataset.md).



## ERROR\_MANIFEST\_TOO\_FEW\_LABELS


### Error message


The manifest file has too few labels.

### More information


Training and testing datasets have a required minimum number of labels. The minimum depends on whether the dataset trains/tests a model to detect image-level labels (classification) or a model that detects object locations. If the training dataset is split to create a testing dataset, the number of labels in the dataset is determined after the training dataset is split. For more information, see [Guidelines and quotas in Amazon Rekognition Custom Labels](limits.md).

**To fix ERROR\_MANIFEST\_TOO\_FEW\_LABELS (console)**

1. Add new labels to the dataset. For more information, see [Managing labels](md-labels.md). 

1. Add the new labels to images in the dataset. If your model detects image-level labels, see [Assigning image-level labels to an image](md-assign-image-level-labels.md). If your model detects object locations, see [Labeling objects with bounding boxes](md-localize-objects.md).



**To fix ERROR\_MANIFEST\_TOO\_FEW\_LABELS (JSON Line)**
+ Add JSON Lines for new images that have new labels. For more information, see [Creating a manifest file](md-create-manifest-file.md). If your model detects image-level labels, you add new label names to the `class-name` field. For example, the label for the following image is *Sunrise*.

  ```
  {
      "source-ref": "s3://bucket/images/sunrise.png",
      "testdataset-classification_Sunrise": 1,
      "testdataset-classification_Sunrise-metadata": {
          "confidence": 1,
          "job-name": "labeling-job/testdataset-classification_Sunrise",
          "class-name": "Sunrise",
          "human-annotated": "yes",
          "creation-date": "2018-10-18T22:18:13.527256",
          "type": "groundtruth/image-classification"
      }
  }
  ```

   If your model detects object locations, add new labels to the `class-map`, as shown in the following example.

  ```
  {
  	"source-ref": "s3://custom-labels-bucket/images/IMG_1186.png",
  	"bounding-box": {
  		"image_size": [{
  			"width": 640,
  			"height": 480,
  			"depth": 3
  		}],
  		"annotations": [{
  			"class_id": 1,
  			"top": 251,
  			"left": 399,
  			"width": 155,
  			"height": 101
  		}, {
  			"class_id": 0,
  			"top": 65,
  			"left": 86,
  			"width": 220,
  			"height": 334
  		}]
  	},
  	"bounding-box-metadata": {
  		"objects": [{
  			"confidence": 1
  		}, {
  			"confidence": 1
  		}],
  		"class-map": {
  			"0": "Echo",
  			"1": "Echo Dot"
  		},
  		"type": "groundtruth/object-detection",
  		"human-annotated": "yes",
  		"creation-date": "2018-10-18T22:18:13.527256",
  		"job-name": "my job"
  	}
  }
  ```

  You need to map the class map table to the bounding box annotations. For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md).

## ERROR\_MANIFEST\_TOO\_MANY\_LABELS


### Error message


The manifest file has too many labels.

### More information


The number of unique labels in the manifest (dataset) is more than the allowed limit. If the training dataset is split to create a testing dataset, the number of labels is determined after the split. 

**To fix ERROR\_MANIFEST\_TOO\_MANY\_LABELS (Console)**
+ Remove labels from the dataset. For more information, see [Managing labels](md-labels.md). The labels are automatically removed from the images and bounding boxes in your dataset.



**To fix ERROR\_MANIFEST\_TOO\_MANY\_LABELS (JSON Line)**
+ Manifests with image level JSON Lines – If the image has a single label, remove the JSON Lines for images that use the label that you want to remove. If the JSON Line contains multiple labels, remove only the JSON object for the label that you want to remove. For more information, see [Adding multiple image-level labels to an image](md-create-manifest-file-classification.md#md-dataset-purpose-classification-multiple-labels). 

  Manifests with object location JSON Lines – Remove the bounding box and associated label information for the label that you want to remove. Do this for each JSON Line that contains that label. You need to remove the label from the `class-map` array and corresponding objects in the `objects` and `annotations` arrays. For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md).
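Editing object location JSON Lines by hand is error-prone because `objects`, `annotations`, and `class-map` must stay in sync. The following sketch removes one label from a single JSON Line; the function name is illustrative, and it assumes the label attribute layout shown earlier in this guide.

```python
import json


def remove_label(json_line, label_attr, label_name):
    """Return a copy of an object localization JSON Line with one label removed
    from the class-map, annotations, and the parallel objects array."""
    entry = json.loads(json_line)
    metadata = entry[f"{label_attr}-metadata"]
    # Class ids mapped to the label being removed (class-map keys are strings).
    doomed = {cid for cid, name in metadata["class-map"].items() if name == label_name}
    annotations = entry[label_attr]["annotations"]
    # objects is parallel to annotations, so filter both with the same mask.
    keep = [str(a["class_id"]) not in doomed for a in annotations]
    entry[label_attr]["annotations"] = [a for a, k in zip(annotations, keep) if k]
    metadata["objects"] = [o for o, k in zip(metadata["objects"], keep) if k]
    metadata["class-map"] = {c: n for c, n in metadata["class-map"].items() if c not in doomed}
    return json.dumps(entry)
```

Applying this to every JSON Line that references the label keeps the manifest internally consistent.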

## ERROR\_INSUFFICIENT\_LABEL\_OVERLAP


### Error message


Less than 50% label overlap between the training and testing manifest files.

### More information


There is less than 50% overlap between the testing dataset label names and the training dataset label names.

**To fix ERROR\_INSUFFICIENT\_LABEL\_OVERLAP (Console)**
+ Remove labels from the training dataset. Alternatively, add more common labels to your testing dataset. For more information, see [Managing labels](md-labels.md). The labels are automatically removed from the images and bounding boxes in your dataset.



**To fix ERROR\_INSUFFICIENT\_LABEL\_OVERLAP by removing labels from the training dataset (JSON Line)**
+ Manifests with image level JSON Lines – If the image has a single label, remove the JSON Line for the image that uses the label that you want to remove. If the JSON Line contains multiple labels, remove only the JSON object for that label. Do this for each JSON Line in the manifest that contains the label that you want to remove. For more information, see [Adding multiple image-level labels to an image](md-create-manifest-file-classification.md#md-dataset-purpose-classification-multiple-labels).

  Manifests with object location JSON Lines – Remove the bounding box and associated label information for the label that you want to remove. Do this for each JSON Line that contains that label. You need to remove the label from the `class-map` array and corresponding objects in the `objects` and `annotations` arrays. For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md).

**To fix ERROR\_INSUFFICIENT\_LABEL\_OVERLAP by adding common labels to the testing dataset (JSON Line)**
+ Add JSON Lines to the testing dataset that include images labeled with labels already in the training dataset. For more information, see [Creating a manifest file](md-create-manifest-file.md).

## ERROR\_MANIFEST\_TOO\_FEW\_USABLE\_LABELS


### Error message


The manifest file has too few usable labels.

### More information


A training manifest can contain JSON Lines in image-level label format and in object location format. Depending on the type of JSON Lines found in the training manifest, Amazon Rekognition Custom Labels chooses to create a model that detects image-level labels, or a model that detects object locations. Amazon Rekognition Custom Labels filters out otherwise valid JSON Lines that are not in the chosen format. ERROR\_MANIFEST\_TOO\_FEW\_USABLE\_LABELS occurs when the number of labels in the chosen model type manifest is insufficient to train the model.

A minimum of 1 label is required to train a model that detects image-level labels. A minimum of 2 labels is required to train a model that detects object locations. 

**To fix ERROR\_MANIFEST\_TOO\_FEW\_USABLE\_LABELS (Console)**

1. Check the `use_case` field in the manifest summary.

1. Add more labels to the training dataset for the use case (image level or object localization) that matches the value of `use_case`. For more information, see [Managing labels](md-labels.md).

**To fix ERROR\_MANIFEST\_TOO\_FEW\_USABLE\_LABELS (JSON Line)**

1. Check the `use_case` field in the manifest summary.

1. Add more labels to the training dataset for the use case (image level or object localization) that matches the value of `use_case`. For more information, see [Creating a manifest file](md-create-manifest-file.md).



## ERROR\_INSUFFICIENT\_USABLE\_LABEL\_OVERLAP


### Error message


Less than 50% usable label overlap between the training and testing manifest files.

### More information


 

A training manifest can contain JSON Lines in image-level label format and in object location format. Depending on the formats found in the training manifest, Amazon Rekognition Custom Labels chooses to create a model that detects image-level labels, or a model that detects object locations. Amazon Rekognition Custom Labels doesn't use valid JSON Lines that are not in the chosen model format. ERROR\_INSUFFICIENT\_USABLE\_LABEL\_OVERLAP occurs when there is less than 50% overlap between the testing and training labels that are used.

**To fix ERROR\_INSUFFICIENT\_USABLE\_LABEL\_OVERLAP (Console)**
+ Remove labels from the training dataset. Alternatively, add more common labels to your testing dataset. For more information, see [Managing labels](md-labels.md). The labels are automatically removed from the images and bounding boxes in your dataset.



**To fix ERROR\_INSUFFICIENT\_USABLE\_LABEL\_OVERLAP by removing labels from the training dataset (JSON Line)**
+ Datasets used to detect image-level labels – If the image has a single label, remove the JSON Line for the image that uses the label you want to remove. If the JSON Line contains multiple labels, remove only the JSON object for the label that you want to remove. For more information, see [Adding multiple image-level labels to an image](md-create-manifest-file-classification.md#md-dataset-purpose-classification-multiple-labels). Do this for each JSON Line in the manifest that contains the label that you want to remove.

  Datasets used to detect object locations – Remove the bounding box and associated label information for the label that you want to remove. Do this for each JSON Line that contains the label. You need to remove the label from the `class-map` array and the corresponding objects in the `objects` and `annotations` arrays. For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md).

**To fix ERROR\_INSUFFICIENT\_USABLE\_LABEL\_OVERLAP by adding common labels to the testing dataset (JSON Line)**
+ Add JSON Lines to the testing dataset that include images labeled with labels already in the training dataset. For more information, see [Creating a manifest file](md-create-manifest-file.md).
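The overlap check above can be approximated with a few lines of code. The following Python sketch (an illustration, not the service's exact algorithm, which isn't documented here) computes the percentage of training labels that also appear in the testing dataset; values below 50% would trigger this error.

```python
def usable_label_overlap(training_labels, testing_labels):
    """Percentage of the training labels that also appear in the testing dataset."""
    training = set(training_labels)
    if not training:
        return 0.0
    shared = training & set(testing_labels)
    return 100.0 * len(shared) / len(training)

# Only one of the two training labels appears in the test set: 50% overlap.
print(usable_label_overlap(["Echo", "Echo Dot"], ["Echo", "Speaker"]))  # 50.0
```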



## ERROR\_FAILED\_IMAGES\_S3\_COPY


### Error message


Failed to copy images from S3 bucket.

### More information


The service wasn't able to copy any of the images in your dataset. 

You can't use the Amazon Rekognition Custom Labels console to fix this error.

**To fix ERROR\_FAILED\_IMAGES\_S3\_COPY**

1. Check the permissions of your images.

1. If you are using AWS KMS, check the bucket policy. For more information, see [Decrypting files encrypted with AWS Key Management Service](su-encrypt-bucket.md#su-kms-encryption).

## ERROR\_TOO\_MANY\_RECORDS\_IN\_ERROR


### Error message


The manifest file has too many terminal errors.

### More information


There are too many JSON Lines with terminal content errors.

**To fix `ERROR_TOO_MANY_RECORDS_IN_ERROR`**
+ Reduce the number of JSON Lines (images) with terminal content errors. For more information, see [Terminal manifest content errors](#tm-debugging-aggregate-errors). 

You can't use the Amazon Rekognition Custom Labels console to fix this error.

# Non-Terminal JSON Line Validation Errors


This topic lists the non-terminal JSON Line validation errors reported by Amazon Rekognition Custom Labels during training. The errors are reported in the training and testing validation manifest. For more information, see [Understanding training and testing validation result manifests](tm-debugging-scope-json-line.md). You can fix a non-terminal JSON Line error by updating the JSON Line in the training or test manifest file. You can also remove the JSON Line from the manifest, but doing so might reduce the quality of your model. If there are many non-terminal validation errors, you might find it easier to recreate the manifest file. Validation errors typically occur in manually created manifest files. For more information, see [Creating a manifest file](md-create-manifest-file.md). For information about fixing validation errors, see [Fixing training errors](tm-debugging-fixing-validation-errors.md). Some errors can be fixed by using the Amazon Rekognition Custom Labels console. 

## ERROR\_MISSING\_SOURCE\_REF


### Error message


The source-ref key is missing.

### More information


The JSON Line `source-ref` field provides the Amazon S3 location of an image. This error occurs when the `source-ref` key is missing or is misspelled. This error typically occurs in manually created manifest files. For more information, see [Creating a manifest file](md-create-manifest-file.md).

**To fix `ERROR_MISSING_SOURCE_REF`**

1. Check that the `source-ref` key is present and is spelled correctly. A complete `source-ref` key and value is similar to the following: `"source-ref": "s3://bucket/path/image"`. 

1. Update the `source-ref` key in the JSON Line, or remove the JSON Line from the manifest file. 

You can't use the Amazon Rekognition Custom Labels console to fix this error.

## ERROR\_INVALID\_SOURCE\_REF\_FORMAT


### Error message


The format of the source-ref value is invalid. 

### More information


The `source-ref` key is present in the JSON Line, but the schema of the Amazon S3 path is incorrect. For example, the path is `https://....` instead of `s3://....`. An ERROR\_INVALID\_SOURCE\_REF\_FORMAT error typically occurs in manually created manifest files. For more information, see [Creating a manifest file](md-create-manifest-file.md). 

**To fix `ERROR_INVALID_SOURCE_REF_FORMAT`**

1. Check that the schema is `"source-ref": "s3://bucket/path/image"`. For example, `"source-ref": "s3://custom-labels-console-us-east-1-1111111111/images/000000242287.jpg"`. 

1. Update, or remove, the JSON Line in the manifest file. 

You can't use the Amazon Rekognition Custom Labels console to fix `ERROR_INVALID_SOURCE_REF_FORMAT`.
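Both `source-ref` problems described above can be caught before training by scanning the manifest yourself. The following Python sketch (an illustration, not part of the Rekognition tooling) checks one JSON Line for a missing key or a non-`s3://` path.

```python
import json

def check_source_ref(json_line):
    """Return the source-ref problems found in one manifest JSON Line."""
    problems = []
    record = json.loads(json_line)
    if "source-ref" not in record:
        problems.append("ERROR_MISSING_SOURCE_REF")
    elif not record["source-ref"].startswith("s3://"):
        problems.append("ERROR_INVALID_SOURCE_REF_FORMAT")
    return problems

print(check_source_ref('{"source-ref": "https://bucket/image.jpg"}'))
# ['ERROR_INVALID_SOURCE_REF_FORMAT']
```

Running the function over every line of the manifest gives you the exact records to update or remove.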

## ERROR\_NO\_LABEL\_ATTRIBUTES


### Error message


No label attributes found.

### More information


The label attribute or the label attribute `-metadata` key name (or both) is invalid or missing. In the following example, `ERROR_NO_LABEL_ATTRIBUTES` occurs whenever the `bounding-box` or `bounding-box-metadata` key (or both) is missing. For more information, see [Creating a manifest file](md-create-manifest-file.md).

```
{
	"source-ref": "s3://custom-labels-bucket/images/IMG_1186.png",
	"bounding-box": {
		"image_size": [{
			"width": 640,
			"height": 480,
			"depth": 3
		}],
		"annotations": [{
			"class_id": 1,
			"top": 251,
			"left": 399,
			"width": 155,
			"height": 101
		}, {
			"class_id": 0,
			"top": 65,
			"left": 86,
			"width": 220,
			"height": 334
		}]
	},
	"bounding-box-metadata": {
		"objects": [{
			"confidence": 1
		}, {
			"confidence": 1
		}],
		"class-map": {
			"0": "Echo",
			"1": "Echo Dot"
		},
		"type": "groundtruth/object-detection",
		"human-annotated": "yes",
		"creation-date": "2018-10-18T22:18:13.527256",
		"job-name": "my job"
	}
}
```

An `ERROR_NO_LABEL_ATTRIBUTES` error typically occurs in a manually created manifest file. For more information, see [Creating a manifest file](md-create-manifest-file.md). 

**To fix `ERROR_NO_LABEL_ATTRIBUTES`**

1. Check that the label attribute and the label attribute `-metadata` keys are present and that the key names are spelled correctly. 

1. Update, or remove, the JSON Line in the manifest file.

You can't use the Amazon Rekognition Custom Labels console to fix `ERROR_NO_LABEL_ATTRIBUTES`.

## ERROR\_INVALID\_LABEL\_ATTRIBUTE\_FORMAT


### Error message


The format of the label attribute \_\_ is invalid.

### More information


The schema for the label attribute key is missing or invalid. An ERROR\_INVALID\_LABEL\_ATTRIBUTE\_FORMAT error typically occurs in manually created manifest files. For more information, see [Creating a manifest file](md-create-manifest-file.md). 

**To fix `ERROR_INVALID_LABEL_ATTRIBUTE_FORMAT`**

1. Check that the JSON Line section for the label attribute key is correct. In the following object location example, the `image_size` and `annotations` objects must be correct. The label attribute key is named `bounding-box`.

   ```
   	"bounding-box": {
   		"image_size": [{
   			"width": 640,
   			"height": 480,
   			"depth": 3
   		}],
   		"annotations": [{
   			"class_id": 1,
   			"top": 251,
   			"left": 399,
   			"width": 155,
   			"height": 101
   		}, {
   			"class_id": 0,
   			"top": 65,
   			"left": 86,
   			"width": 220,
   			"height": 334
   		}]
   	},
   ```

   

1. Update, or remove, the JSON Line in the manifest file.

You can't use the Amazon Rekognition Custom Labels console to fix this error.

## ERROR\_INVALID\_LABEL\_ATTRIBUTE\_METADATA\_FORMAT


### Error message


The format of the label attribute metadata is invalid.

### More information


The schema for the label attribute metadata key is missing or invalid. An ERROR\_INVALID\_LABEL\_ATTRIBUTE\_METADATA\_FORMAT error typically occurs in manually created manifest files. For more information, see [Creating a manifest file](md-create-manifest-file.md).

**To fix `ERROR_INVALID_LABEL_ATTRIBUTE_METADATA_FORMAT`**

1. Check that the JSON Line schema for the label attribute metadata key is similar to the following example. The label attribute metadata key is named `bounding-box-metadata`.

   ```
   	"bounding-box-metadata": {
   		"objects": [{
   			"confidence": 1
   		}, {
   			"confidence": 1
   		}],
   		"class-map": {
   			"0": "Echo",
   			"1": "Echo Dot"
   		},
   		"type": "groundtruth/object-detection",
   		"human-annotated": "yes",
   		"creation-date": "2018-10-18T22:18:13.527256",
   		"job-name": "my job"
   	}
   ```

   

1. Update, or remove, the JSON Line in the manifest file.



You can't use the Amazon Rekognition Custom Labels console to fix this error.

## ERROR\_NO\_VALID\_LABEL\_ATTRIBUTES


### Error message


No valid label attributes found.

### More information


No valid label attributes were found in the JSON Line. Amazon Rekognition Custom Labels checks both the label attribute and the label attribute identifier. An ERROR\_NO\_VALID\_LABEL\_ATTRIBUTES error typically occurs in manually created manifest files. For more information, see [Creating a manifest file](md-create-manifest-file.md). 

If a JSON Line isn't in a supported SageMaker AI manifest format, Amazon Rekognition Custom Labels marks the JSON Line as invalid and an `ERROR_NO_VALID_LABEL_ATTRIBUTES` error is reported. Currently, Amazon Rekognition Custom Labels supports classification job and bounding box formats. For more information, see [Creating a manifest file](md-create-manifest-file.md).

**To fix `ERROR_NO_VALID_LABEL_ATTRIBUTES`**

1. Check that the JSON for the label attribute key and label attribute metadata is correct.

1. Update, or remove, the JSON Line in the manifest file. For more information, see [Creating a manifest file](md-create-manifest-file.md).

You can't use the Amazon Rekognition Custom Labels console to fix this error.

## ERROR\_MISSING\_BOUNDING\_BOX\_CONFIDENCE


### Error message


One or more bounding boxes has a missing confidence value.

### More information


The confidence key is missing for one or more object location bounding boxes. The confidence key for a bounding box is in the label attribute metadata, as shown in the following example. An ERROR\_MISSING\_BOUNDING\_BOX\_CONFIDENCE error typically occurs in manually created manifest files. For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md).

```
	"bounding-box-metadata": {
		"objects": [{
			"confidence": 1
		}, {
			"confidence": 1
		}],
```

**To fix `ERROR_MISSING_BOUNDING_BOX_CONFIDENCE`**

1. Check that the `objects` array in the label attribute contains the same number of confidence keys as there are objects in the label attribute `annotations` array.

1. Update, or remove, the JSON Line in the manifest file.



You can't use the Amazon Rekognition Custom Labels console to fix this error.
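Step 1 above is a simple count comparison. A minimal Python sketch (a hypothetical helper, not an AWS tool) that flags a parsed JSON Line whose `annotations` and `objects` counts don't match:

```python
def confidence_count_matches(record, attribute="bounding-box"):
    """True if every bounding box annotation has a confidence entry in the metadata."""
    annotations = record[attribute]["annotations"]
    objects = record[attribute + "-metadata"]["objects"]
    return len(annotations) == len(objects)

record = {
    "bounding-box": {"annotations": [{"class_id": 0}, {"class_id": 1}]},
    "bounding-box-metadata": {"objects": [{"confidence": 1}]},  # one confidence missing
}
print(confidence_count_matches(record))  # False
```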

## ERROR\_MISSING\_CLASS\_MAP\_ID


### Error message


One or more class ids is missing from the class map.

### More information


The `class_id` in an annotation (bounding box) object doesn't have a matching entry in the label attribute metadata class map (`class-map`). For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md). An ERROR\_MISSING\_CLASS\_MAP\_ID error typically occurs in manually created manifest files.

**To fix ERROR\_MISSING\_CLASS\_MAP\_ID**

1. Check that the `class_id` value in each annotation (bounding box) object has a corresponding entry in the `class-map` object, as shown in the following example. Every `class_id` used in the `annotations` array should have an entry in `class-map`.

   ```
   {
   	"source-ref": "s3://custom-labels-bucket/images/IMG_1186.png",
   	"bounding-box": {
   		"image_size": [{
   			"width": 640,
   			"height": 480,
   			"depth": 3
   		}],
   		"annotations": [{
   			"class_id": 1, 
   			"top": 251,
   			"left": 399,
   			"width": 155,
   			"height": 101
   		}, {
   			"class_id": 0,
   			"top": 65,
   			"left": 86,
   			"width": 220,
   			"height": 334
   		}]
   	},
   	"bounding-box-metadata": {
   		"objects": [{
   			"confidence": 1
   		}, {
   			"confidence": 1
   		}],
   		"class-map": {
   			"0": "Echo",
   			"1": "Echo Dot"
   		}, 
   		"type": "groundtruth/object-detection",
   		"human-annotated": "yes",
   		"creation-date": "2018-10-18T22:18:13.527256",
   		"job-name": "my job"
   	}
   }
   ```

1. Update, or remove, the JSON Line in the manifest file.

You can't use the Amazon Rekognition Custom Labels console to fix this error.
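The check in step 1 can be automated. The following Python sketch (illustrative only) lists the `class_id` values used by annotations that have no entry in `class-map`:

```python
def missing_class_map_ids(record, attribute="bounding-box"):
    """Return class_id values used in annotations but absent from the class map."""
    class_map = record[attribute + "-metadata"]["class-map"]
    used = {annotation["class_id"] for annotation in record[attribute]["annotations"]}
    return sorted(used - {int(key) for key in class_map})

record = {
    "bounding-box": {"annotations": [{"class_id": 0}, {"class_id": 1}]},
    "bounding-box-metadata": {"class-map": {"0": "Echo"}},  # id 1 is unmapped
}
print(missing_class_map_ids(record))  # [1]
```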

## ERROR\_INVALID\_JSON\_LINE


### Error message


The JSON Line has an invalid format.

### More information


An unexpected character was found in the JSON Line. The JSON Line is replaced with a new JSON Line that contains only the error information. An ERROR\_INVALID\_JSON\_LINE error typically occurs in manually created manifest files. For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md). 

You can't use the Amazon Rekognition Custom Labels console to fix this error.

**To fix `ERROR_INVALID_JSON_LINE`**

1. Open the manifest file and navigate to the JSON Line where the ERROR\_INVALID\_JSON\_LINE error occurs.

1. Check that the JSON Line doesn't contain invalid characters and that required `:` or `,` characters are not missing.

1. Update, or remove, the JSON Line in the manifest file.
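Because each manifest line must be a standalone JSON document, you can locate lines with this error by parsing the file line by line. A minimal Python sketch (illustrative only):

```python
import json

def invalid_json_lines(manifest_text):
    """Return the 1-based line numbers of manifest lines that fail to parse as JSON."""
    bad = []
    for number, line in enumerate(manifest_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            json.loads(line)
        except json.JSONDecodeError:
            bad.append(number)
    return bad

# The second line has a trailing comma, which is invalid JSON.
manifest = '{"source-ref": "s3://bucket/a.jpg"}\n{"source-ref": "s3://bucket/b.jpg",}'
print(invalid_json_lines(manifest))  # [2]
```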

## ERROR\_INVALID\_IMAGE


### Error message


The image is invalid. Check S3 path and/or image properties.

### More information


The file referenced by `source-ref` is not a valid image. Potential causes include the image aspect ratio, the size of the image, and the image format.

For more information, see [Guidelines and quotas in Amazon Rekognition Custom Labels](limits.md).

**To fix `ERROR_INVALID_IMAGE`**

1. Check the following.
   + The aspect ratio of the image is less than 20:1.
   + The size of the image is less than 15 MB.
   + The image is in PNG or JPEG format. 
   + The path to the image in `source-ref` is correct.
   + The minimum image dimension of the image is greater than 64 pixels x 64 pixels.
   + The maximum image dimension of the image is less than 4096 pixels x 4096 pixels.

1. Update, or remove, the JSON Line in the manifest file.

You can't use the Amazon Rekognition Custom Labels console to fix this error.
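The checklist above can be expressed as a validation function. The following Python sketch applies the documented limits to image properties you supply (it doesn't read the image file itself):

```python
def image_problems(width, height, size_bytes, image_format):
    """Check image properties against the limits listed above."""
    problems = []
    if image_format not in ("PNG", "JPEG"):
        problems.append("format must be PNG or JPEG")
    if size_bytes >= 15 * 1024 * 1024:
        problems.append("size must be less than 15 MB")
    if min(width, height) < 64:
        problems.append("minimum dimension is 64 pixels")
    if max(width, height) > 4096:
        problems.append("maximum dimension is 4096 pixels")
    if max(width, height) / min(width, height) >= 20:
        problems.append("aspect ratio must be less than 20:1")
    return problems

print(image_problems(640, 480, 500_000, "JPEG"))  # []
```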

## ERROR\_INVALID\_IMAGE\_DIMENSION


### Error message


The image dimension(s) do not conform to allowed dimensions. 

### More information


The image referenced by `source-ref` doesn't conform to the allowed image dimensions. The minimum dimension is 64 pixels. The maximum dimension is 4096 pixels. `ERROR_INVALID_IMAGE_DIMENSION` is reported for images with bounding boxes. 

For more information, see [Guidelines and quotas in Amazon Rekognition Custom Labels](limits.md).

**To fix `ERROR_INVALID_IMAGE_DIMENSION` (Console)**

1. Update the image in the Amazon S3 bucket with dimensions that Amazon Rekognition Custom Labels can process.

1. In the Amazon Rekognition Custom Labels console, do the following:

   1. Remove the existing bounding boxes from the image.

   1. Re-add the bounding boxes to the image.

   1. Save your changes.

   For more information, see [Labeling objects with bounding boxes](md-localize-objects.md).

**To fix `ERROR_INVALID_IMAGE_DIMENSION` (SDK)**

1. Update the image in the Amazon S3 bucket with dimensions that Amazon Rekognition Custom Labels can process.

1. Get the existing JSON Line for the image by calling [ListDatasetEntries](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_ListDatasetEntries). For the `SourceRefContains` input parameter, specify the Amazon S3 location and file name of the image.

1. Call [UpdateDatasetEntries](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_UpdateDatasetEntries) and provide the JSON line for the image. Make sure the value of `source-ref` matches the image location in the Amazon S3 bucket. Update the bounding box annotations to match the bounding box dimensions needed for the updated image.

   ```
   {
   	"source-ref": "s3://custom-labels-bucket/images/IMG_1186.png",
   	"bounding-box": {
   		"image_size": [{
   			"width": 640,
   			"height": 480,
   			"depth": 3
   		}],
   		"annotations": [{
   			"class_id": 1,
   			"top": 251,
   			"left": 399,
   			"width": 155,
   			"height": 101
   		}, {
   			"class_id": 0,
   			"top": 65,
   			"left": 86,
   			"width": 220,
   			"height": 334
   		}]
   	},
   	"bounding-box-metadata": {
   		"objects": [{
   			"confidence": 1
   		}, {
   			"confidence": 1
   		}],
   		"class-map": {
   			"0": "Echo",
   			"1": "Echo Dot"
   		},
   		"type": "groundtruth/object-detection",
   		"human-annotated": "yes",
   		"creation-date": "2013-11-18T02:53:27",
   		"job-name": "my job"
   	}
   }
   ```

    

## ERROR\_INVALID\_BOUNDING\_BOX


### Error message


The bounding box has off frame values.

### More information


The bounding box information specifies a box that extends beyond the image frame or contains negative values.

For more information, see [Guidelines and quotas in Amazon Rekognition Custom Labels](limits.md).

**To fix `ERROR_INVALID_BOUNDING_BOX`**

1. Check the values of the bounding boxes in the `annotations` array. 

   ```
   	"bounding-box": {
   		"image_size": [{
   			"width": 640,
   			"height": 480,
   			"depth": 3
   		}],
   		"annotations": [{
   			"class_id": 1,
   			"top": 251,
   			"left": 399,
   			"width": 155,
   			"height": 101
   		}]
   	},
   ```

1. Update, or alternatively remove, the JSON Line from the manifest file.

You can't use the Amazon Rekognition Custom Labels console to fix this error.
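You can test each annotation against the image frame programmatically. A minimal Python sketch (illustrative only), using the pixel coordinates from the example above:

```python
def box_in_frame(box, image_width, image_height):
    """True if the box has no negative values and lies fully inside the image frame."""
    return (
        box["left"] >= 0
        and box["top"] >= 0
        and box["width"] > 0
        and box["height"] > 0
        and box["left"] + box["width"] <= image_width
        and box["top"] + box["height"] <= image_height
    )

box = {"class_id": 1, "top": 251, "left": 399, "width": 155, "height": 101}
print(box_in_frame(box, 640, 480))  # True
```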

## ERROR\_NO\_VALID\_ANNOTATIONS


### Error message


No valid annotations found.

### More information


None of the annotation objects in the JSON Line contain valid bounding box information. 

**To fix `ERROR_NO_VALID_ANNOTATIONS`**

1. Update the `annotations` array to include valid bounding box objects. Also, check that corresponding bounding box information (`confidence` and `class_map`) in the label attribute metadata is correct. For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md).

   ```
   {
   	"source-ref": "s3://custom-labels-bucket/images/IMG_1186.png",
   	"bounding-box": {
   		"image_size": [{
   			"width": 640,
   			"height": 480,
   			"depth": 3
   		}],
   		"annotations": [
   		   {              
   			"class_id": 1,    #annotation object
   			"top": 251,
   			"left": 399,
   			"width": 155,
   			"height": 101
   		}, {
   			"class_id": 0,
   			"top": 65,
   			"left": 86,
   			"width": 220,
   			"height": 334
   		}]
   	},
   	"bounding-box-metadata": {
   		"objects": [
   		{                
   			"confidence": 1          #confidence  object
   		}, 
           {
   			"confidence": 1
   		}],
   		"class-map": {  
   			"0": "Echo",    #label 
   			"1": "Echo Dot"
   		},
   		"type": "groundtruth/object-detection",
   		"human-annotated": "yes",
   		"creation-date": "2018-10-18T22:18:13.527256",
   		"job-name": "my job"
   	}
   }
   ```

1. Update, or alternatively remove, the JSON Line from the manifest file.

You can't use the Amazon Rekognition Custom Labels console to fix this error.

## ERROR\_BOUNDING\_BOX\_TOO\_SMALL


### Error message


The height and width of the bounding box is too small.

### More information


The bounding box dimensions (height and width) have to be greater than 1 x 1 pixels.

During training, Amazon Rekognition Custom Labels resizes an image if any of its dimensions are greater than 1280 pixels (the source images aren't affected). The resulting bounding box heights and widths must be greater than 1 x 1 pixels. A bounding box location is stored in the `annotations` array of an object location JSON Line. For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md).

```
	"bounding-box": {
		"image_size": [{
			"width": 640,
			"height": 480,
			"depth": 3
		}],
		"annotations": [{
			"class_id": 1,
			"top": 251,
			"left": 399,
			"width": 155,
			"height": 101
		}]
	},
```

The error information is added to the annotation object.

**To fix ERROR\_BOUNDING\_BOX\_TOO\_SMALL**
+ Choose one of the following options.
  + Increase the size of bounding boxes that are too small.
  + Remove bounding boxes that are too small. For information about removing a bounding box, see [ERROR\_TOO\_MANY\_BOUNDING\_BOXES](#tm-error-ERROR_TOO_MANY_BOUNDING_BOXES).
  + Remove the image (JSON Line) from the manifest.
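To estimate whether a box survives the training-time resize, scale it by the same factor as the image. The following Python sketch assumes simple proportional scaling of the longest image side down to 1280 pixels (the exact resize algorithm isn't documented here); the resulting box must stay larger than 1 x 1 pixels.

```python
def resized_box_size(box_width, box_height, image_width, image_height, max_side=1280):
    """Approximate bounding-box size after the longest image side is capped at max_side."""
    scale = min(1.0, max_side / max(image_width, image_height))
    return box_width * scale, box_height * scale

# A 10 x 10 pixel box in a 2560 x 2560 image shrinks to 5 x 5 pixels: still usable.
print(resized_box_size(10, 10, 2560, 2560))  # (5.0, 5.0)
```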





## ERROR\_TOO\_MANY\_BOUNDING\_BOXES


### Error message


There are more bounding boxes than the allowed maximum.

### More information


There are more bounding boxes than the allowed limit (50). You can remove excess bounding boxes in the Amazon Rekognition Custom Labels console, or you can remove them from the JSON Line.

**To fix `ERROR_TOO_MANY_BOUNDING_BOXES` (Console)**

1. Decide which bounding boxes to remove. 

1. Open the Amazon Rekognition console at [https://console.aws.amazon.com/rekognition/](https://console.aws.amazon.com/rekognition/).

1. Choose **Use Custom Labels**.

1. Choose **Get started**. 

1. In the left navigation pane, choose the project that contains the dataset that you want to use.

1. In the **Datasets** section, choose the dataset that you want to use.

1. In the dataset gallery page, choose **Start labeling** to enter labeling mode.

1. Choose the image that you want to remove bounding boxes from.

1. Choose **Draw bounding box**. 

1. In the drawing tool, choose the bounding box that you want to delete.

1. Press the delete key on your keyboard to delete the bounding box.

1. Repeat the previous 2 steps until you have deleted enough bounding boxes.

1. Choose **Done**.

1. Choose **Save changes** to save your changes. 

1. Choose **Exit** to exit labeling mode.



**To fix ERROR\_TOO\_MANY\_BOUNDING\_BOXES (JSON Line)**

1. Open the manifest file and navigate to the JSON Line where the ERROR\_TOO\_MANY\_BOUNDING\_BOXES error occurs.

1. Remove the following for each bounding box that you want to remove. 
   + Remove the corresponding annotation object from the `annotations` array.
   + Remove the corresponding `confidence` object from the `objects` array in the label attribute metadata.
   + If the label is no longer used by other bounding boxes, remove it from the `class-map`.

   Use the following example to identify which items to remove.

   ```
   {
   	"source-ref": "s3://custom-labels-bucket/images/IMG_1186.png",
   	"bounding-box": {
   		"image_size": [{
   			"width": 640,
   			"height": 480,
   			"depth": 3
   		}],
   		"annotations": [
   		   {              
   			"class_id": 1,    #annotation object
   			"top": 251,
   			"left": 399,
   			"width": 155,
   			"height": 101
   		}, {
   			"class_id": 0,
   			"top": 65,
   			"left": 86,
   			"width": 220,
   			"height": 334
   		}]
   	},
   	"bounding-box-metadata": {
   		"objects": [
   		{                
   			"confidence": 1          #confidence  object
   		}, 
           {
   			"confidence": 1
   		}],
   		"class-map": {  
   			"0": "Echo",    #label 
   			"1": "Echo Dot"
   		},
   		"type": "groundtruth/object-detection",
   		"human-annotated": "yes",
   		"creation-date": "2018-10-18T22:18:13.527256",
   		"job-name": "my job"
   	}
   }
   ```
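The three removals above must stay in sync. The following Python sketch (a hypothetical helper, not an AWS tool) removes one box by its index in the `annotations` array and prunes the label from `class-map` only when no remaining box still uses it:

```python
def remove_bounding_box(record, index, attribute="bounding-box"):
    """Remove one bounding box (annotation plus confidence) and prune unused labels."""
    removed = record[attribute]["annotations"].pop(index)
    record[attribute + "-metadata"]["objects"].pop(index)
    class_map = record[attribute + "-metadata"]["class-map"]
    still_used = {a["class_id"] for a in record[attribute]["annotations"]}
    if removed["class_id"] not in still_used:
        class_map.pop(str(removed["class_id"]), None)
    return record

record = {
    "bounding-box": {
        "annotations": [{"class_id": 1, "top": 251}, {"class_id": 0, "top": 65}],
    },
    "bounding-box-metadata": {
        "objects": [{"confidence": 1}, {"confidence": 1}],
        "class-map": {"0": "Echo", "1": "Echo Dot"},
    },
}
remove_bounding_box(record, 0)  # removes the "Echo Dot" box and its unused label
print(record["bounding-box-metadata"]["class-map"])  # {'0': 'Echo'}
```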



## WARNING\_UNANNOTATED\_RECORD


### Warning Message


Record is unannotated.

### More information


An image added to a dataset by using the Amazon Rekognition Custom Labels console wasn't labeled. The JSON line for the image isn't used for training. 

```
{
    "source-ref": "s3://bucket/images/IMG_1186.png",
    "warnings": [
        {
            "code": "WARNING_UNANNOTATED_RECORD",
            "message": "Record is unannotated."
        } 
    ]
}
```

**To fix WARNING\_UNANNOTATED\_RECORD**
+ Label the image by using the Amazon Rekognition Custom Labels console. For instructions, see [Assigning image-level labels to an image](md-assign-image-level-labels.md).





## WARNING\_NO\_ANNOTATIONS


### Warning Message


No annotations provided.

### More information


A JSON Line in Object Localization format doesn't contain any bounding box information, despite being annotated by a human (`human-annotated = yes`). The JSON Line is valid, but isn't used for training. For more information, see [Understanding training and testing validation result manifests](tm-debugging-scope-json-line.md). 

```
{
    "source-ref": "s3://bucket/images/IMG_1186.png",
    "bounding-box": {
        "image_size": [
            {
                "width": 640,
                "height": 480,
                "depth": 3
            }
        ],
        "annotations": [
           
        ],
        "warnings": [
            {
                "code": "WARNING_NO_ATTRIBUTE_ANNOTATIONS",
                "message": "No attribute annotations were found."
            }
        ]
    },
    "bounding-box-metadata": {
        "objects": [
           
        ],
        "class-map": {
           
        },
        "type": "groundtruth/object-detection",
        "human-annotated": "yes",
        "creation-date": "2013-11-18 02:53:27",
        "job-name": "my job"
    },
    "warnings": [
        {
            "code": "WARNING_NO_ANNOTATIONS",
            "message": "No annotations were found."
        } 
    ]
}
```

**To fix WARNING\_NO\_ANNOTATIONS**
+ Choose one of the following options.
  + Add the bounding box (`annotations`) information to the JSON Line. For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md). 
  + Remove the image (JSON Line) from the manifest.

## WARNING\_NO\_ATTRIBUTE\_ANNOTATIONS


### Warning Message


No attribute annotations provided.

### More information


A JSON Line in Object Localization format doesn't contain any bounding box annotation information, despite being annotated by a human (`human-annotated = yes`). The `annotations` array is not present or is not populated. The JSON Line is valid, but isn't used for training. For more information, see [Understanding training and testing validation result manifests](tm-debugging-scope-json-line.md). 

```
{
    "source-ref": "s3://bucket/images/IMG_1186.png",
    "bounding-box": {
        "image_size": [
            {
                "width": 640,
                "height": 480,
                "depth": 3
            }
        ],
        "annotations": [
           
        ],
        "warnings": [
            {
                "code": "WARNING_NO_ATTRIBUTE_ANNOTATIONS",
                "message": "No attribute annotations were found."
            }
        ]
    },
    "bounding-box-metadata": {
        "objects": [
           
        ],
        "class-map": {
           
        },
        "type": "groundtruth/object-detection",
        "human-annotated": "yes",
        "creation-date": "2013-11-18 02:53:27",
        "job-name": "my job"
    },
    "warnings": [
        {
            "code": "WARNING_NO_ANNOTATIONS",
            "message": "No annotations were found."
        }
    ]
}
```

**To fix WARNING\_NO\_ATTRIBUTE\_ANNOTATIONS**
+ Choose one of the following options.
  + Add one or more bounding box `annotation` objects to the JSON Line. For more information, see [Object localization in manifest files](md-create-manifest-file-object-detection.md). 
  + Remove the bounding box attribute.
  + Remove the image (JSON Line) from the manifest. If other valid bounding box attributes exist in the JSON Line, you can instead remove just the invalid bounding box attribute from the JSON Line.

## ERROR\_UNSUPPORTED\_USE\_CASE\_TYPE


### Error message


### More information


The value of the `type` field isn't `groundtruth/image-classification` or `groundtruth/object-detection`. For more information, see [Creating a manifest file](md-create-manifest-file.md). 

```
{
    "source-ref": "s3://bucket/test_normal_8.jpg",
    "BB": {
        "annotations": [
            {
                "left": 1768,
                "top": 1007,
                "width": 448,
                "height": 295,
                "class_id": 0
            },
            {
                "left": 1794,
                "top": 1306,
                "width": 432,
                "height": 411,
                "class_id": 1
            },
            {
                "left": 2568,
                "top": 1346,
                "width": 710,
                "height": 305,
                "class_id": 2
            },
            {
                "left": 2571,
                "top": 1020,
                "width": 644,
                "height": 312,
                "class_id": 3
            }
        ],
        "image_size": [
            {
                "width": 4000,
                "height": 2667,
                "depth": 3
            }
        ]
    },
    "BB-metadata": {
        "job-name": "labeling-job/BB",
        "class-map": {
            "0": "comparator",
            "1": "pot_resistor",
            "2": "ir_phototransistor",
            "3": "ir_led"
        },
        "human-annotated": "yes",
        "objects": [
            {
                "confidence": 1
            },
            {
                "confidence": 1
            },
            {
                "confidence": 1
            },
            {
                "confidence": 1
            }
        ],
        "creation-date": "2021-06-22T09:58:34.811Z",
        "type": "groundtruth/wrongtype",
        "cl-errors": [
            {
                "code": "ERROR_UNSUPPORTED_USE_CASE_TYPE",
                "message": "The use case type of the BB-metadata label attribute metadata is unsupported. Check the type field."
            }
        ]
    },
    "cl-metadata": {
        "is_labeled": true
    },
    "cl-errors": [
        {
            "code": "ERROR_NO_VALID_LABEL_ATTRIBUTES",
            "message": "No valid label attributes found."
        }
    ]
}
```

**To fix ERROR\_UNSUPPORTED\_USE\_CASE\_TYPE**
+ Choose one of the following options:
  + Change the value of the `type` field to `groundtruth/image-classification` or `groundtruth/object-detection`, depending on the type of model that you want to create. For more information, see [Creating a manifest file](md-create-manifest-file.md). 
  + Remove the image (JSON Line) from the manifest.

## ERROR\_INVALID\_LABEL\_NAME\_LENGTH


### More information


A label name is too long. The maximum length of a label name is 256 characters. 

**To fix ERROR\_INVALID\_LABEL\_NAME\_LENGTH**
+ Choose one of the following options:
  + Reduce the length of the label name to 256 characters or less.
  + Remove the image (JSON Line) from the manifest.