
# Get compiled recommendations with Neo
<a name="inference-recommender-neo-compilation"></a>

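In Inference Recommender, you can compile your model with Neo and get endpoint recommendations for the compiled model. [SageMaker Neo](https://docs.aws.amazon.com/sagemaker/latest/dg/neo.html) is a service that optimizes a model for a target hardware platform (that is, a specific instance type or environment). Optimizing a model with Neo might improve the performance of your hosted model.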

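For frameworks and containers that Neo supports, Inference Recommender automatically includes Neo-optimized recommendations. To be eligible for Neo compilation, your input must meet the following prerequisites:
+ You are using a SageMaker AI owned [DLC](https://docs.aws.amazon.com/deep-learning-containers/latest/devguide/what-is-dlc.html) or XGBoost container.
+ You are using a framework version that Neo supports. For the framework versions that Neo supports, see [Cloud Instances](neo-supported-cloud.md#neo-supported-cloud-instances) in the SageMaker Neo documentation.
+ Neo requires that you provide the correct input data shape for your model. You can specify this data shape as the [`DataInputConfig`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_ModelInput.html#sagemaker-Type-ModelInput-DataInputConfig) in the [`InferenceSpecification`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateModelPackage.html#sagemaker-CreateModelPackage-request-InferenceSpecification) when you create a model package. For information about the correct data shape for each framework, see [Prepare Model for Compilation](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-compilation-preparing-model.html) in the SageMaker Neo documentation.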

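  The following example shows how to specify the `DataInputConfig` field in the `InferenceSpecification`, where `data_input_configuration` is a variable that contains the data shape in dictionary format (for example, `{'input':[1,1024,1024,3]}`).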

  ```
  "InferenceSpecification": {
      "Containers": [
          {
              "Image": dlc_uri,
              "Framework": framework.upper(),
              "FrameworkVersion": framework_version,
              "NearestModelName": model_name,
              "ModelInput": {"DataInputConfig": data_input_configuration},
          }
      ],
      "SupportedContentTypes": input_mime_types,  # required, must be non-null
      "SupportedResponseMIMETypes": [],
      "SupportedRealtimeInferenceInstanceTypes": supported_realtime_inference_types,  # optional
  }
  ```
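As a minimal sketch of how the fragment above might be assembled in Python, the helper below builds the `InferenceSpecification` dictionary from a shape dictionary. The function name `build_inference_specification` and every argument value are hypothetical placeholders, not part of the original example. The sketch also assumes that `DataInputConfig` is passed as a JSON string, so the shape dictionary is serialized with `json.dumps`.

```python
import json


def build_inference_specification(image_uri, framework, framework_version,
                                  model_name, input_shapes, content_types):
    """Return an InferenceSpecification dict that carries the input data shape."""
    # Assumption: DataInputConfig is sent as a string, so the shape
    # dictionary is serialized to JSON before it goes into the request.
    data_input_configuration = json.dumps(input_shapes)
    return {
        "Containers": [
            {
                "Image": image_uri,
                "Framework": framework.upper(),
                "FrameworkVersion": framework_version,
                "NearestModelName": model_name,
                "ModelInput": {"DataInputConfig": data_input_configuration},
            }
        ],
        "SupportedContentTypes": content_types,
        "SupportedResponseMIMETypes": [],
    }


# All values below are illustrative placeholders, not real resources.
spec = build_inference_specification(
    image_uri="<dlc-image-uri>",
    framework="tensorflow",
    framework_version="<framework-version>",
    model_name="<nearest-model-name>",
    input_shapes={"input": [1, 1024, 1024, 3]},
    content_types=["application/x-image"],
)
print(json.dumps(spec, indent=2))
```

The resulting dictionary could then be passed as the `InferenceSpecification` argument when you create the model package.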

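If your request meets these conditions, Inference Recommender runs scenarios for both the compiled and uncompiled versions of your model, which gives you multiple recommendation combinations to choose from. You can compare the configurations of the compiled and uncompiled versions of the same inference recommendation and determine which one best suits your use case. The recommendations are ranked by cost per inference.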

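To get Neo compilation recommendations, you don't have to do any additional configuration beyond making sure that your input meets the preceding requirements. If your input meets the requirements, Inference Recommender automatically runs Neo compilation on your model, and you receive a response that includes Neo recommendations.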

If you encounter errors during Neo compilation, see [Troubleshoot Neo Compilation Errors](neo-troubleshooting-compilation.md).

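The following table is an example of a response that you might get from an Inference Recommender job that includes recommendations for compiled models. If the `InferenceSpecificationName` field is `None`, the recommendation is for an uncompiled model. The last row, in which the **InferenceSpecificationName** field has the value `neo-00011122-2333-4445-5566-677788899900`, is for a model compiled with Neo. The value in the field is the name of the Neo job used to compile and optimize your model.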


| EndpointName | InstanceType | InitialInstanceCount | EnvironmentParameters | CostPerHour | CostPerInference | MaxInvocations | ModelLatency | InferenceSpecificationName | 
| --- | --- | --- | --- | --- | --- | --- | --- | --- | 
| sm-epc-example-000111222 | ml.c5.9xlarge | 1 | [] | 1.836 | 9.15E-07 | 33456 | 7 | None | 
| sm-epc-example-111222333 | ml.c5.2xlarge | 1 | [] | 0.408 | 2.11E-07 | 32211 | 21 | None | 
| sm-epc-example-222333444 | ml.c5.xlarge | 1 | [] | 0.204 | 1.86E-07 | 18276 | 92 | None | 
| sm-epc-example-333444555 | ml.c5.xlarge | 1 | [] | 0.204 | 1.60E-07 | 21286 | 42 | neo-00011122-2333-4445-5566-677788899900 | 
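
To illustrate how the compiled and uncompiled rows in a table like this one can be compared, the following sketch ranks recommendations by cost per inference and groups them by whether they were compiled. The flat row layout simply mirrors the table columns and is an assumption for illustration, not the exact shape of the API response. The helper names `is_compiled`, `rank_by_cost`, and `compare_on_instance` are hypothetical.

```python
NEO_JOB = "neo-00011122-2333-4445-5566-677788899900"

# Rows copied from the example table; only the columns used below are kept.
rows = [
    {"EndpointName": "sm-epc-example-000111222", "InstanceType": "ml.c5.9xlarge",
     "CostPerInference": 9.15e-07, "ModelLatency": 7, "InferenceSpecificationName": None},
    {"EndpointName": "sm-epc-example-111222333", "InstanceType": "ml.c5.2xlarge",
     "CostPerInference": 2.11e-07, "ModelLatency": 21, "InferenceSpecificationName": None},
    {"EndpointName": "sm-epc-example-222333444", "InstanceType": "ml.c5.xlarge",
     "CostPerInference": 1.86e-07, "ModelLatency": 92, "InferenceSpecificationName": None},
    {"EndpointName": "sm-epc-example-333444555", "InstanceType": "ml.c5.xlarge",
     "CostPerInference": 1.60e-07, "ModelLatency": 42, "InferenceSpecificationName": NEO_JOB},
]


def is_compiled(row):
    """A row describes a Neo-compiled model when the specification name is set."""
    return row["InferenceSpecificationName"] is not None


def rank_by_cost(recommendations):
    """Return the recommendations ordered from cheapest to most expensive per inference."""
    return sorted(recommendations, key=lambda row: row["CostPerInference"])


def compare_on_instance(recommendations, instance_type):
    """Group the recommendations for one instance type into compiled and uncompiled."""
    matching = [row for row in recommendations if row["InstanceType"] == instance_type]
    return {
        "compiled": [row for row in matching if is_compiled(row)],
        "uncompiled": [row for row in matching if not is_compiled(row)],
    }


for row in rank_by_cost(rows):
    label = "compiled" if is_compiled(row) else "uncompiled"
    print(f'{row["InstanceType"]:<14} {label:<10} '
          f'cost={row["CostPerInference"]:.2e} latency={row["ModelLatency"]}')
```

With the example data, the two `ml.c5.xlarge` rows can be compared directly: the compiled row has both a lower cost per inference and a lower model latency than the uncompiled row.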

## Get started
<a name="inference-recommender-neo-compilation-get-started"></a>

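The general steps to create an Inference Recommender job that includes Neo-optimized recommendations are as follows:
+ Prepare your machine learning (ML) model for compilation. For more information, see [Prepare Model for Compilation](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-compilation-preparing-model.html) in the Neo documentation.
+ Package your model in a model archive (`.tar.gz` file).
+ Create a sample payload archive.
+ Register your model in SageMaker Model Registry.
+ Create an Inference Recommender job.
+ View the results of the Inference Recommender job and choose a configuration.
+ Debug compilation failures, if any. For more information, see [Troubleshoot Neo Compilation Errors](https://docs.aws.amazon.com/sagemaker/latest/dg/neo-troubleshooting-compilation.html).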
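
The job creation step could look roughly like the following sketch, which assumes the AWS SDK for Python (Boto3) and a model package that is already registered. The helper names `build_job_request` and `submit_job`, the job name, and both ARNs are hypothetical placeholders. Boto3 is imported only inside `submit_job`, so the request can be built and inspected without AWS credentials.

```python
def build_job_request(job_name, role_arn, model_package_version_arn):
    """Assemble the arguments for a default Inference Recommender job."""
    return {
        "JobName": job_name,
        "JobType": "Default",  # instance recommendation job
        "RoleArn": role_arn,
        "InputConfig": {"ModelPackageVersionArn": model_package_version_arn},
    }


def submit_job(request, region_name):
    """Send the request to SageMaker; requires boto3 and valid credentials."""
    import boto3  # imported lazily so that building the request needs no SDK

    client = boto3.client("sagemaker", region_name=region_name)
    return client.create_inference_recommendations_job(**request)


# Placeholder identifiers for illustration only; replace them with your own.
request = build_job_request(
    job_name="example-neo-recommender-job",
    role_arn="arn:aws:iam::111122223333:role/ExampleRole",
    model_package_version_arn=(
        "arn:aws:sagemaker:us-west-2:111122223333:model-package/example-group/1"
    ),
)
print(request)
# submit_job(request, region_name="us-west-2")  # uncomment to create the job
```

No Neo-specific parameter appears in the request. As described earlier, Inference Recommender runs Neo compilation automatically when the registered model package meets the prerequisites.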

For an example that demonstrates the preceding workflow and how to get Neo-optimized recommendations using XGBoost, see the following [example notebook](https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-inference-recommender/xgboost/xgboost-inference-recommender.ipynb). For an example that shows how to get Neo-optimized recommendations using TensorFlow, see the following [example notebook](https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-inference-recommender/inference-recommender.ipynb).