


# Troubleshoot Neo Inference Errors
<a name="neo-troubleshooting-inference"></a>

This section contains information about how to prevent and resolve some of the common errors you might encounter when deploying and/or invoking an endpoint. This section applies to **PyTorch 1.4.0 or later** and **MXNet v1.7.0 or later**.
+ If you define a `model_fn` in your inference script, make sure that `model_fn()` runs a first inference (warm-up inference) on valid input data; otherwise, you may see the following error message on the terminal when you call [predict](https://sagemaker.readthedocs.io/en/stable/api/inference/predictors.html#sagemaker.predictor.Predictor.predict):

  ```
  An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (0) from <users-sagemaker-endpoint> with message "Your invocation timed out while waiting for a response from container model. Review the latency metrics for each container in Amazon CloudWatch, resolve the issue, and try again."                
  ```
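
  A minimal sketch of an inference script that performs the warm-up inference inside `model_fn` might look like the following. The artifact name `model.pt`, the use of TorchScript, and the dummy input shape are assumptions; adapt them to your compiled model:

  ```python
  import torch


  def model_fn(model_dir):
      """Load the model and run one warm-up inference.

      Running the first (warm-up) inference on valid input data inside
      model_fn avoids the invocation timeout shown above.
      """
      # Assumption: the compiled model was saved as a TorchScript
      # artifact named model.pt inside model_dir.
      model = torch.jit.load(f"{model_dir}/model.pt", map_location="cpu")
      model.eval()
      # Warm-up inference with a dummy input matching the model's
      # expected shape (assumed here: a 1x3x224x224 image tensor).
      with torch.no_grad():
          model(torch.rand(1, 3, 224, 224))
      return model
  ```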
+ Make sure you set the required environment variables, as shown in the table in the linked AWS documentation. If these variables are not set, you may see the following error messages:

  **On the terminal:**

  ```
  An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (503) from <users-sagemaker-endpoint> with message "{ "code": 503, "type": "InternalServerException", "message": "Prediction failed" } ".
  ```

  **In CloudWatch:**

  ```
  W-9001-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - AttributeError: 'NoneType' object has no attribute 'transform'
  ```    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/zh_tw/sagemaker/latest/dg/neo-troubleshooting-inference.html)
+ When you create the Amazon SageMaker AI model, make sure the `MMS_DEFAULT_RESPONSE_TIMEOUT` environment variable is set to 500 or a higher value; otherwise, you may see the following error message on the terminal:

  ```
  An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (0) from <users-sagemaker-endpoint> with message "Your invocation timed out while waiting for a response from container model. Review the latency metrics for each container in Amazon CloudWatch, resolve the issue, and try again."
  ```
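
The environment-variable requirements above can be sketched as follows. Only `MMS_DEFAULT_RESPONSE_TIMEOUT` is named in this section; the remaining variable names come from the table in the linked AWS documentation, and the image URI, model artifact path, and role below are placeholders:

```python
# Environment variables to attach to the SageMaker AI model.
env = {
    # Set to 500 or higher to avoid the invocation timeout shown above.
    "MMS_DEFAULT_RESPONSE_TIMEOUT": "500",
    # Add the remaining variables from the table in the linked
    # AWS documentation here.
}

# Hypothetical usage with the SageMaker Python SDK:
# from sagemaker.model import Model
# model = Model(
#     image_uri="<neo-inference-image-uri>",
#     model_data="s3://<bucket>/<compiled-model>.tar.gz",
#     role="<sagemaker-execution-role-arn>",
#     env=env,
# )
```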