


# Distributing a training dataset (SDK)
<a name="md-distributing-datasets"></a>

Amazon Rekognition Custom Labels requires a training dataset and a test dataset to train your model.

If you are using the API, you can use the [DistributeDatasetEntries](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_DistributeDatasetEntries) API to distribute 20% of the training dataset into an empty test dataset. Distributing the training dataset is useful if you have only a single manifest file available. Use the single manifest file to create your training dataset. Then create an empty test dataset and use `DistributeDatasetEntries` to populate the test dataset.
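The workflow described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a complete implementation: the helper names `wait_for_dataset` and `split_training_dataset` are hypothetical, `rek_client` is assumed to be a Boto3 Rekognition client, and the sketch assumes that calling `create_dataset` without a `DatasetSource` creates an empty dataset that reports `CREATE_IN_PROGRESS` until it is ready.

```python
import time


def wait_for_dataset(rek_client, dataset_arn, in_progress_status, poll_seconds=5):
    """Polls DescribeDataset until the dataset leaves the given in-progress status."""
    while True:
        description = rek_client.describe_dataset(
            DatasetArn=dataset_arn)["DatasetDescription"]
        if description["Status"] != in_progress_status:
            return description["Status"]
        time.sleep(poll_seconds)


def split_training_dataset(rek_client, project_arn, training_dataset_arn,
                           poll_seconds=5):
    """Creates an empty test dataset and requests distribution into it."""
    # Assumption: omitting DatasetSource creates an empty dataset.
    test_dataset_arn = rek_client.create_dataset(
        ProjectArn=project_arn, DatasetType="TEST")["DatasetArn"]
    wait_for_dataset(rek_client, test_dataset_arn,
                     "CREATE_IN_PROGRESS", poll_seconds)

    # The request lists both the training dataset and the test dataset.
    rek_client.distribute_dataset_entries(
        Datasets=[{"Arn": training_dataset_arn}, {"Arn": test_dataset_arn}])
    return test_dataset_arn
```

The sketch returns as soon as the distribution request is accepted. Before training, you would still poll both datasets until they leave the `UPDATE_IN_PROGRESS` status, as the full examples in the procedure do.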

**Note**  
If you are using the Amazon Rekognition Custom Labels console and you start with a single dataset project, Amazon Rekognition Custom Labels splits (distributes) the training dataset during training to create a test dataset. 20% of the training dataset entries are moved to the test dataset.
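One way to check the resulting split is to compare the entry counts of the two datasets. The following Python sketch is illustrative only: the function name `fraction_in_test_dataset` is hypothetical, `rek_client` is assumed to be a Boto3 Rekognition client, and the sketch assumes that the `DescribeDataset` response includes `DatasetStats` with a `TotalEntries` count. The share it reports might not be exactly 20%, because the count of entries is a whole number.

```python
def fraction_in_test_dataset(rek_client, training_dataset_arn, test_dataset_arn):
    """Returns the fraction of all entries that are in the test dataset."""

    def total_entries(dataset_arn):
        # Assumption: DatasetStats.TotalEntries is present in the response.
        description = rek_client.describe_dataset(
            DatasetArn=dataset_arn)["DatasetDescription"]
        return description["DatasetStats"]["TotalEntries"]

    training_count = total_entries(training_dataset_arn)
    test_count = total_entries(test_dataset_arn)
    total = training_count + test_count

    # Avoid dividing by zero when both datasets are empty.
    return test_count / total if total else 0.0
```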

**To distribute a training dataset (SDK)**

1. If you haven't already done so, install and configure the AWS CLI and the AWS SDKs. For more information, see [Step 4: Set up the AWS CLI and AWS SDKs](su-awscli-sdk.md).

1. Create a project. For more information, see [Creating an Amazon Rekognition Custom Labels project (SDK)](mp-create-project.md#mp-create-project-sdk).

1. Create your training dataset. For information about datasets, see [Creating training and test datasets](creating-datasets.md).

1. Create an empty test dataset.

1. Use the following example code to distribute 20% of the training dataset entries into the test dataset. You can get the Amazon Resource Names (ARNs) for a project's datasets by calling [DescribeProjects](https://docs.aws.amazon.com/rekognition/latest/APIReference/API_DescribeProjects). For example code, see [Describing a project (SDK)](md-describing-project-sdk.md).

------
#### [ AWS CLI ]

   Change the values of `training_dataset_arn` and `test_dataset_arn` to the ARNs of the datasets that you want to use.

   ```
   aws rekognition distribute-dataset-entries --datasets '[{"Arn": "{{training_dataset_arn}}"}, {"Arn": "{{test_dataset_arn}}"}]' \
     --profile custom-labels-access
   ```

------
#### [ Python ]

   Use the following code. Supply the following command line parameters:
   + training\_dataset\_arn: The ARN of the training dataset that you distribute entries from.
   + test\_dataset\_arn: The ARN of the test dataset that you distribute entries to.

   ```
   # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
   # SPDX-License-Identifier: Apache-2.0
   
   import argparse
   import logging
   import time
   import json
   import boto3
   
   from botocore.exceptions import ClientError
   
   logger = logging.getLogger(__name__)
   
   
   def check_dataset_status(rek_client, dataset_arn):
       """
       Checks the current status of a dataset.
       :param rek_client: The Amazon Rekognition Custom Labels Boto3 client.
       :param dataset_arn: The dataset that you want to check.
       :return: The dataset status and status message.
       """
       finished = False
       status = ""
       status_message = ""
   
       while finished is False:
   
           dataset = rek_client.describe_dataset(DatasetArn=dataset_arn)
   
           status = dataset['DatasetDescription']['Status']
           # StatusMessage might be absent from the response, so default to an empty string.
           status_message = dataset['DatasetDescription'].get('StatusMessage', '')
   
           if status == "UPDATE_IN_PROGRESS":
   
               logger.info("Distributing dataset: %s ", dataset_arn)
               time.sleep(5)
               continue
   
           if status == "UPDATE_COMPLETE":
               logger.info(
                   "Dataset distribution complete: %s : %s : %s",
                       status, status_message, dataset_arn)
               finished = True
               continue
   
           if status == "UPDATE_FAILED":
               # No exception is active here, so log with error() rather than exception().
               logger.error(
                   "Dataset distribution failed: %s : %s : %s",
                       status, status_message, dataset_arn)
               finished = True
               break
   
           logger.error(
               "Failed. Unexpected state for dataset distribution: %s : %s : %s",
               status, status_message, dataset_arn)
           finished = True
           status_message = "An unexpected error occurred while distributing the dataset"
           break
   
       return status, status_message
   
   
   def distribute_dataset_entries(rek_client, training_dataset_arn, test_dataset_arn):
       """
       Distributes 20% of the supplied training dataset into the supplied test dataset.
       :param rek_client: The Amazon Rekognition Custom Labels Boto3 client.
       :param training_dataset_arn: The ARN of the training dataset that you distribute entries from.
       :param test_dataset_arn: The ARN of the test dataset that you distribute entries to.
       """
   
       try:
           # Distribute the training dataset entries.
           logger.info("Distributing training dataset entries (%s) into test dataset (%s).",
               training_dataset_arn, test_dataset_arn)
   
           # Build the request list directly instead of parsing a concatenated JSON string.
           datasets = [{"Arn": str(training_dataset_arn)}, {"Arn": str(test_dataset_arn)}]
   
           rek_client.distribute_dataset_entries(
               Datasets=datasets
           )
   
           training_dataset_status, training_dataset_status_message = check_dataset_status(
               rek_client, training_dataset_arn)
           test_dataset_status, test_dataset_status_message = check_dataset_status(
               rek_client, test_dataset_arn)
   
           if training_dataset_status == 'UPDATE_COMPLETE' and test_dataset_status == "UPDATE_COMPLETE":
               print("Distribution complete")
           else:
               print("Distribution failed:")
               print(
                   f"\ttraining dataset: {training_dataset_status} : {training_dataset_status_message}")
               print(
                   f"\ttest dataset: {test_dataset_status} : {test_dataset_status_message}")
   
       except ClientError as err:
           logger.exception(
               "Couldn't distribute dataset: %s",err.response['Error']['Message'] )
           raise
   
   
   def add_arguments(parser):
       """
       Adds command line arguments to the parser.
       :param parser: The command line parser.
       """
   
       parser.add_argument(
           "training_dataset_arn", help="The ARN of the training dataset that you want to distribute from."
       )
   
       parser.add_argument(
           "test_dataset_arn", help="The ARN of the test dataset that you want to distribute to."
       )
   
   
   def main():
   
       logging.basicConfig(level=logging.INFO,
                           format="%(levelname)s: %(message)s")
   
       try:
   
           # Get command line arguments.
           parser = argparse.ArgumentParser(usage=argparse.SUPPRESS)
           add_arguments(parser)
           args = parser.parse_args()
   
           print(
               f"Distributing training dataset entries ({args.training_dataset_arn}) "\
               f"into test dataset ({args.test_dataset_arn}).")
   
           # Distribute the datasets.
   
           session = boto3.Session(profile_name='custom-labels-access')
           rekognition_client = session.client("rekognition")
   
           distribute_dataset_entries(rekognition_client,
                                      args.training_dataset_arn,
                                      args.test_dataset_arn)
   
           print("Finished distributing datasets.")
   
       except ClientError as err:
           logger.exception("Problem distributing datasets: %s", err)
           print(f"Problem distributing datasets: {err}")
       except Exception as err:
           logger.exception("Problem distributing datasets: %s", err)
           print(f"Problem distributing datasets: {err}")
   
   
   if __name__ == "__main__":
       main()
   ```

------
#### [ Java V2 ]

   Use the following code. Supply the following command line parameters:
   + training\_dataset\_arn: The ARN of the training dataset that you distribute entries from.
   + test\_dataset\_arn: The ARN of the test dataset that you distribute entries to.

   ```
   /*
      Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
      SPDX-License-Identifier: Apache-2.0
   */
   package com.example.rekognition;
   
   import software.amazon.awssdk.auth.credentials.ProfileCredentialsProvider;
   import software.amazon.awssdk.regions.Region;
   import software.amazon.awssdk.services.rekognition.RekognitionClient;
   import software.amazon.awssdk.services.rekognition.model.DatasetDescription;
   import software.amazon.awssdk.services.rekognition.model.DatasetStatus;
   import software.amazon.awssdk.services.rekognition.model.DescribeDatasetRequest;
   import software.amazon.awssdk.services.rekognition.model.DescribeDatasetResponse;
   import software.amazon.awssdk.services.rekognition.model.DistributeDataset;
   import software.amazon.awssdk.services.rekognition.model.DistributeDatasetEntriesRequest;
   import software.amazon.awssdk.services.rekognition.model.RekognitionException;
   
   import java.util.ArrayList;
   import java.util.logging.Level;
   import java.util.logging.Logger;
   
   public class DistributeDatasetEntries {
   
       public static final Logger logger = Logger.getLogger(DistributeDatasetEntries.class.getName());
   
       public static DatasetStatus checkDatasetStatus(RekognitionClient rekClient, String datasetArn)
               throws Exception, RekognitionException {
   
           boolean distributed = false;
           DatasetStatus status = null;
   
           // Wait until distribution completes
   
           do {
   
               DescribeDatasetRequest describeDatasetRequest = DescribeDatasetRequest.builder().datasetArn(datasetArn)
                       .build();
               DescribeDatasetResponse describeDatasetResponse = rekClient.describeDataset(describeDatasetRequest);
   
               DatasetDescription datasetDescription = describeDatasetResponse.datasetDescription();
   
               status = datasetDescription.status();
   
               logger.log(Level.INFO, " dataset ARN: {0} ", datasetArn);
   
               switch (status) {
   
               case UPDATE_COMPLETE:
                   logger.log(Level.INFO, "Dataset updated");
                   distributed = true;
                   break;
   
               case UPDATE_IN_PROGRESS:
                   Thread.sleep(5000);
                   break;
   
               case UPDATE_FAILED:
                   String error = "Dataset distribution failed: " + datasetDescription.statusAsString() + " "
                           + datasetDescription.statusMessage() + " " + datasetArn;
                   logger.log(Level.SEVERE, error);
                   // Stop polling; the caller checks the returned status.
                   return status;
   
               default:
                   String unexpectedError = "Unexpected distribution state: " + datasetDescription.statusAsString() + " "
                           + datasetDescription.statusMessage() + " " + datasetArn;
                   logger.log(Level.SEVERE, unexpectedError);
                   // Stop polling on any unexpected state instead of looping forever.
                   return status;
   
               }
   
           } while (distributed == false);
   
           return status;
   
       }
   
       public static void distributeMyDatasetEntries(RekognitionClient rekClient, String trainingDatasetArn,
               String testDatasetArn) throws Exception, RekognitionException {
   
           try {
   
               logger.log(Level.INFO, "Distributing {0} dataset to {1} ",
                       new Object[] { trainingDatasetArn, testDatasetArn });
   
               DistributeDataset distributeTrainingDataset = DistributeDataset.builder().arn(trainingDatasetArn).build();
   
               DistributeDataset distributeTestDataset = DistributeDataset.builder().arn(testDatasetArn).build();
   
               ArrayList<DistributeDataset> datasets = new ArrayList<>();
   
               datasets.add(distributeTrainingDataset);
               datasets.add(distributeTestDataset);
   
               DistributeDatasetEntriesRequest distributeDatasetEntriesRequest = DistributeDatasetEntriesRequest.builder()
                       .datasets(datasets).build();
   
               rekClient.distributeDatasetEntries(distributeDatasetEntriesRequest);
   
               DatasetStatus trainingStatus = checkDatasetStatus(rekClient, trainingDatasetArn);
               DatasetStatus testStatus = checkDatasetStatus(rekClient, testDatasetArn);
   
               if (trainingStatus == DatasetStatus.UPDATE_COMPLETE && testStatus == DatasetStatus.UPDATE_COMPLETE) {
                   logger.log(Level.INFO, "Successfully distributed dataset: {0}", trainingDatasetArn);
   
               } else {
   
                   throw new Exception("Failed to distribute dataset: " + trainingDatasetArn);
               }
   
           } catch (RekognitionException e) {
               logger.log(Level.SEVERE, "Could not distribute dataset: {0}", e.getMessage());
               throw e;
           }
   
       }
   
       public static void main(String[] args) {
   
           String trainingDatasetArn = null;
           String testDatasetArn = null;
   
           final String USAGE = "\n" + "Usage: " + "<training_dataset_arn> <test_dataset_arn>\n\n" + "Where:\n"
                   + "   training_dataset_arn - the ARN of the dataset that you want to distribute from.\n\n"
                   + "   test_dataset_arn - the ARN of the dataset that you want to distribute to.\n\n";
   
           if (args.length != 2) {
               System.out.println(USAGE);
               System.exit(1);
           }
   
           trainingDatasetArn = args[0];
           testDatasetArn = args[1];
   
           try {
   
               // Get the Rekognition client.
               RekognitionClient rekClient = RekognitionClient.builder()
                   .credentialsProvider(ProfileCredentialsProvider.create("custom-labels-access"))
                   .region(Region.US_WEST_2)
                   .build();
   
               // Distribute the dataset
               distributeMyDatasetEntries(rekClient, trainingDatasetArn, testDatasetArn);
   
               System.out.println("Datasets distributed.");
   
               rekClient.close();
   
           } catch (RekognitionException rekError) {
               logger.log(Level.SEVERE, "Rekognition client error: {0}", rekError.getMessage());
               System.exit(1);
           } catch (Exception rekError) {
               logger.log(Level.SEVERE, "Error: {0}", rekError.getMessage());
               System.exit(1);
           }
   
       }
   
   }
   ```

------
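The examples above take dataset ARNs as input, and the procedure mentions `DescribeProjects` as a way to get them. The following Python sketch shows one possible lookup. It is an illustration under stated assumptions: the function name `get_dataset_arns` is hypothetical, `rek_client` is assumed to be a Boto3 Rekognition client, and the sketch assumes that each project description carries a `Datasets` list whose items have `DatasetType` and `DatasetArn` fields.

```python
def get_dataset_arns(rek_client, project_name):
    """Returns a dict that maps dataset type (for example 'TRAIN') to dataset ARN."""
    response = rek_client.describe_projects(ProjectNames=[project_name])
    projects = response["ProjectDescriptions"]

    if not projects:
        raise ValueError(f"Project not found: {project_name}")

    # A project without datasets yields an empty dict.
    return {
        dataset["DatasetType"]: dataset["DatasetArn"]
        for dataset in projects[0].get("Datasets", [])
    }
```

With a real client, the returned dictionary could supply the two ARNs that the distribution examples expect, for example `arns["TRAIN"]` and `arns["TEST"]`.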