기계 번역으로 제공되는 번역입니다. 제공된 번역과 원본 영어의 내용이 상충하는 경우에는 영어 버전이 우선합니다.

# MXNet-Neuron 모델 제공 사용
<a name="tutorial-inferentia-mxnet-neuron-serving"></a>

이 자습서에서는 사전 교육된 MXNet 모델을 사용하여 다중 모델 서버(MMS)로 실시간 이미지를 분류하는 방법을 알아봅니다. MMS는 기계 학습 또는 딥 러닝 프레임워크를 사용하여 교육 받은 딥 러닝 모델을 제공하기 위한 유연하고 사용하기 쉬운 도구입니다. 이 자습서에는 AWS Neuron을 사용한 컴파일 단계와 MXNet을 사용한 MMS 구현이 포함되어 있습니다.

 Neuron SDK에 대한 자세한 내용은 [AWS Neuron SDK 설명서](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/neuron-frameworks/mxnet-neuron/index.html)를 참조하세요.

**Topics**
+ [사전 조건](#tutorial-inferentia-mxnet-neuron-serving-prerequisites)
+ [Conda 환경 활성화](#tutorial-inferentia-mxnet-neuron-serving-activate)
+ [예제 코드 다운로드](#tutorial-inferentia-mxnet-neuron-serving-download)
+ [모델 컴파일](#tutorial-inferentia-mxnet-neuron-serving-compile)
+ [추론 실행](#tutorial-inferentia-mxnet-neuron-serving-inference)

## 사전 조건
<a name="tutorial-inferentia-mxnet-neuron-serving-prerequisites"></a>

 이 자습서를 사용하기 전에 [AWS Neuron을 사용하여 DLAMI 인스턴스 시작](tutorial-inferentia-launching.md)의 설정 단계를 완료해야 합니다. 또한 딥 러닝 및 DLAMI 사용에 익숙해야 합니다.

## Conda 환경 활성화
<a name="tutorial-inferentia-mxnet-neuron-serving-activate"></a>

 다음 명령을 사용하여 MXNet-Neuron conda 환경을 활성화합니다.

```
source activate aws_neuron_mxnet_p36
```

 현재 conda 환경을 종료하려면 다음을 실행합니다.

```
source deactivate
```

## 예제 코드 다운로드
<a name="tutorial-inferentia-mxnet-neuron-serving-download"></a>

 이 예제를 실행하려면 다음 명령을 사용하여 예제 코드를 다운로드합니다.

```
git clone https://github.com/awslabs/multi-model-server
cd multi-model-server/examples/mxnet_vision
```

## 모델 컴파일
<a name="tutorial-inferentia-mxnet-neuron-serving-compile"></a>

다음 콘텐츠를 통해 `multi-model-server-compile.py`라는 Python 스크립트를 생성합니다. 이 스크립트는 ResNet50 모델을 Inferentia 디바이스 대상으로 컴파일합니다.

```
import mxnet as mx
from mxnet.contrib import neuron
import numpy as np

path='http://data.mxnet.io/models/imagenet/'
mx.test_utils.download(path+'resnet/50-layers/resnet-50-0000.params')
mx.test_utils.download(path+'resnet/50-layers/resnet-50-symbol.json')
mx.test_utils.download(path+'synset.txt')

nn_name = "resnet-50"

#Load a model
sym, args, auxs = mx.model.load_checkpoint(nn_name, 0)

#Define compilation parameters#  - input shape and dtype
inputs = {'data' : mx.nd.zeros([1,3,224,224], dtype='float32') }

# compile graph to inferentia target
csym, cargs, cauxs = neuron.compile(sym, args, auxs, inputs)

# save compiled model
mx.model.save_checkpoint(nn_name + "_compiled", 0, csym, cargs, cauxs)
```

 모델을 컴파일하려면 다음 명령을 사용합니다.

```
python multi-model-server-compile.py
```

 출력은 다음과 같아야 합니다.

```
...
[21:18:40] src/nnvm/legacy_json_util.cc:209: Loading symbol saved by previous version v0.8.0. Attempting to upgrade...
[21:18:40] src/nnvm/legacy_json_util.cc:217: Symbol successfully upgraded!
[21:19:00] src/operator/subgraph/build_subgraph.cc:698: start to execute partition graph.
[21:19:00] src/nnvm/legacy_json_util.cc:209: Loading symbol saved by previous version v0.8.0. Attempting to upgrade...
[21:19:00] src/nnvm/legacy_json_util.cc:217: Symbol successfully upgraded!
```

 다음 콘텐츠를 통해 `signature.json`이라는 파일을 생성하여 입력 이름과 셰이프를 구성합니다.

```
{
  "inputs": [
    {
      "data_name": "data",
      "data_shape": [
        1,
        3,
        224,
        224
      ]
    }
  ]
}
```

다음 명령을 사용하여 `synset.txt` 파일을 다운로드합니다. 이 파일은 ImageNet 예측 클래스의 이름 목록입니다.

```
curl -O https://s3.amazonaws.com/model-server/model_archive_1.0/examples/squeezenet_v1.1/synset.txt
```

`model_server_template` 폴더의 템플릿에 따라 사용자 지정 서비스 클래스를 생성합니다. 다음 명령을 사용하여 템플릿을 현재 작업 디렉터리에 복사합니다.

```
cp -r ../model_service_template/* .
```

 `mxnet_model_service.py` 모듈을 편집하여 `mx.cpu()` 컨텍스트를 다음과 같이 `mx.neuron()` 컨텍스트로 바꿉니다. 또한 MXNet-Neuron은 NDARray 및 Gluon API를 지원하지 않으므로 `model_input`에 불필요한 데이터 복사본을 주석 처리해야 합니다.

```
...
self.mxnet_ctx = mx.neuron() if gpu_id is None else mx.gpu(gpu_id)
...
#model_input = [item.as_in_context(self.mxnet_ctx) for item in model_input]
```

 다음 명령을 사용하여 model-archiver로 모델을 패키징합니다.

```
cd ~/multi-model-server/examples
model-archiver --force --model-name resnet-50_compiled --model-path mxnet_vision --handler mxnet_vision_service:handle
```

## 추론 실행
<a name="tutorial-inferentia-mxnet-neuron-serving-inference"></a>

다중 모델 서버를 시작하고 다음 명령을 사용하여 RESTful API를 사용하는 모델을 로드합니다. **neuron-rtd**가 기본 설정으로 실행 중인지 확인합니다.

```
cd ~/multi-model-server/
multi-model-server --start --model-store examples > /dev/null # Pipe to log file if you want to keep a log of MMS
curl -v -X POST "http://localhost:8081/models?initial_workers=1&max_workers=4&synchronous=true&url=resnet-50_compiled.mar"
sleep 10 # allow sufficient time to load model
```

 다음 명령을 사용하여 예제 이미지로 추론을 실행합니다.

```
curl -O https://raw.githubusercontent.com/awslabs/multi-model-server/master/docs/images/kitten_small.jpg
curl -X POST http://127.0.0.1:8080/predictions/resnet-50_compiled -T kitten_small.jpg
```

 출력은 다음과 같아야 합니다.

```
[
  {
    "probability": 0.6388034820556641,
    "class": "n02123045 tabby, tabby cat"
  },
  {
    "probability": 0.16900072991847992,
    "class": "n02123159 tiger cat"
  },
  {
    "probability": 0.12221276015043259,
    "class": "n02124075 Egyptian cat"
  },
  {
    "probability": 0.028706775978207588,
    "class": "n02127052 lynx, catamount"
  },
  {
    "probability": 0.01915954425930977,
    "class": "n02129604 tiger, Panthera tigris"
  }
]
```

 테스트 후 정리하려면 RESTful API를 통해 delete 명령을 실행하고 다음 명령을 사용하여 모델 서버를 중지합니다.

```
curl -X DELETE http://127.0.0.1:8081/models/resnet-50_compiled

multi-model-server --stop
```

 다음 결과가 표시됩니다.

```
{
  "status": "Model \"resnet-50_compiled\" unregistered"
}
Model server stopped.
Found 1 models and 1 NCGs.
Unloading 10001 (MODEL_STATUS_STARTED) :: success
Destroying NCG 1 :: success
```