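# Custom labeling workflows

These topics help you set up a Ground Truth labeling job that uses a custom labeling template. A custom labeling template lets you build the custom worker portal UI that workers use to label data. You can create templates with HTML, CSS, JavaScript, the [Liquid template language](https://shopify.github.io/liquid/), and [Crowd HTML Elements](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-ui-template-reference.html).

## Overview

If this is your first time creating a custom labeling workflow in Ground Truth, the following is a summary of the required steps.

1. *Set up your workforce* – To create a custom labeling workflow you need a workforce. This topic explains how to configure one.
1. *Create a custom template* – To create a custom template, you must correctly map the data in your input manifest file to the variables in your template.
1. *Use optional processing Lambda functions* – Control how data from your input manifest is added to the worker template, and how worker annotations are recorded in the job's output file.

This topic also provides three end-to-end demos to help you understand how to use custom labeling templates.

**Note**
The examples in the following links all include pre-annotation and post-annotation Lambda functions. These Lambda functions are optional.
+ [Demo template: Annotation of images with `crowd-bounding-box`](sms-custom-templates-step2-demo1.md)
+ [Demo template: Labeling intents with `crowd-classifier`](sms-custom-templates-step2-demo2.md)
+ [Build a custom data labeling workflow with Amazon SageMaker Ground Truth](https://aws.amazon.com/blogs/machine-learning/build-a-custom-data-labeling-workflow-with-amazon-sagemaker-ground-truth/)

**Topics**
+ [Overview](#sms-custom-templates-overview)
+ [Set up your workforce](sms-custom-templates-step1.md)
+ [Creating a custom worker task template](sms-custom-templates-step2.md)
+ [Adding automation with Liquid](sms-custom-templates-step2-automate.md)
+ [Processing data in a custom labeling workflow with AWS Lambda](sms-custom-templates-step3.md)
+ [Demo template: Annotation of images with `crowd-bounding-box`](sms-custom-templates-step2-demo1.md)
+ [Demo template: Labeling intents with `crowd-classifier`](sms-custom-templates-step2-demo2.md)
+ [Create a custom workflow using the API](sms-custom-templates-step4.md)

# Set up your workforce

In this step you use the console to choose which worker type to use and to configure the sub-options that worker type requires. This step assumes you have already completed the previous steps in [Getting started: Create a bounding box labeling job with Ground Truth](sms-getting-started.md) and have chosen **Custom labeling task** as the **Task type**.

**To configure your workforce**

1. First choose an option from **Worker types**. Three types are currently available:
   + **Public** uses an on-demand workforce of independent contractors, powered by Amazon Mechanical Turk. They are paid on a per-task basis.
   + **Private** uses your employees or contractors to handle data that needs to stay within your organization.
   + **Vendor** uses third-party vendors that specialize in data labeling services, available through the AWS Marketplace.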
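1. If you choose the **Public** option, you are asked to set the **number of workers per dataset object**. Having more than one worker perform the same task on the same object can help increase the accuracy of your results. The default is three. You can raise or lower that number as needed.

   You are also asked to set a **price per task** from a drop-down menu. The menu suggests price points based on how long the task takes to complete.

   The recommended way to determine this is to first run a short test of your task with a **private** workforce. The test gives a realistic estimate of how long the task takes to complete. You can then select the range your estimate falls into in the **Price per task** menu. If your average time is more than 5 minutes, consider breaking your task into smaller units.

## Next

[Creating a custom worker task template](sms-custom-templates-step2.md)

# Creating a custom worker task template

To create a custom labeling job, you need to update the worker task template, map the input data from your manifest file to the variables used in the template, and map the output data to Amazon S3. To learn about advanced features that use Liquid automation, see [Adding automation with Liquid](sms-custom-templates-step2-automate.md).

The following sections describe each required step.

## Worker task template

A *worker task template* is a file that Ground Truth uses to customize the worker user interface (UI). You can create a worker task template with HTML, CSS, JavaScript, the [Liquid template language](https://shopify.github.io/liquid/), and [Crowd HTML Elements](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-ui-template-reference.html). Liquid is used to automate the template. Crowd HTML Elements are used to include common annotation tools and to provide the logic for submitting to Ground Truth.

Use the following topics to learn how to create a worker task template. You can see a repository of example Ground Truth worker task templates on [GitHub](https://github.com/aws-samples/amazon-sagemaker-ground-truth-task-uis).

### Using the base worker task templates in the SageMaker AI console

You can use the template editor in the Ground Truth console to start creating a template. The editor includes a number of pre-designed base templates and supports autofill for HTML and Crowd HTML Element code.

**To access the Ground Truth custom template editor:**

1. Follow the instructions in [Create a labeling job (console)](sms-create-labeling-job-console.md).
1. Then select **Custom** for the labeling job **Task type**.
1. Choose **Next**. You can then access the template editor and base templates in the **Custom labeling task setup** section.
1. (Optional) Select a base template from the **Templates** drop-down menu. If you prefer to create a template from scratch, choose **Custom** from the drop-down menu for a minimal template skeleton.

Use the following section to learn how to visualize, locally, a template developed in the console.

#### Visualizing your worker task templates locally

You must use the console to test how your template processes incoming data. To test the look and feel of your template's HTML and custom elements, you can use your browser.

**Note**
Variables are not parsed. You may need to replace them with sample content while viewing your content locally.

The following example code snippet loads the code needed to render the custom HTML elements. Use this if you want to develop your template's look and feel in your preferred editor rather than in the console.

**Example**

```
<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
```

### Creating a simple HTML task sample

Now that you have the base worker task template, you can use this topic to create a simple HTML-based task template.

The following is an example entry from an input manifest file.

```
{
  "source": "This train is really late.",
  "labels": [ "angry" , "sad", "happy" , "inconclusive" ],
  "header": "What emotion is the speaker feeling?"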
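}
```

In the HTML task template, the variables from the input manifest file must be mapped to the template. The variables from the example input manifest are mapped using the syntax **task.input.source**, **task.input.labels**, and **task.input.header**.

The following is a simple example HTML worker task template for tweet analysis. All tasks begin and end with the `<crowd-form> </crowd-form>` elements. Like standard HTML `<form>` elements, all of your form code should go between them. Ground Truth generates the workers' tasks directly from the context specified in the template, unless you implement a pre-annotation Lambda. The `taskInput` object returned by Ground Truth or by your [pre-annotation Lambda](sms-custom-templates-step3-lambda-requirements.md#sms-custom-templates-step3-prelambda) is the `task.input` object in your templates.

For a simple tweet-analysis task, use the `<crowd-classifier>` element. It requires the following attributes:
+ *name* – The name of your output variable. Worker annotations are saved to this variable name in your output manifest.
+ *categories* – A JSON-formatted array of the possible answers.
+ *header* – A title for the annotation tool.

The `<crowd-classifier>` element requires at least the following three child elements.
+ `<classification-target>` – The text the worker classifies based on the options specified in the `categories` attribute above.
+ `<full-instructions>` – Instructions that are available from the "View full instructions" link in the tool. This can be left blank, but good instructions are recommended for better results.
+ `<short-instructions>` – A briefer description of the task that appears in the tool's sidebar. This can be left blank, but good instructions are recommended for better results.

A simple version of this tool looks like the following. The variable **{{ task.input.source }}** specifies the source data from your input manifest file. **{{ task.input.labels | to_json }}** is an example of a variable filter that turns the array into a JSON representation. The `categories` attribute must be JSON.

**Example of using `crowd-classifier` with the sample input manifest JSON**

```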

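<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
<crowd-form>
  <!-- "tweetFeeling" is an illustrative output variable name -->
  <crowd-classifier
    name="tweetFeeling"
    categories="{{ task.input.labels | to_json }}"
    header="{{ task.input.header }}"
  >
    <classification-target>
      {{ task.input.source }}
    </classification-target>

    <full-instructions>
      Try to determine the sentiment the author of the tweet is trying to express. If none seem to match, choose "cannot determine."
    </full-instructions>

    <short-instructions>
      Pick the term that best describes the sentiment of the tweet.
    </short-instructions>
  </crowd-classifier>
</crowd-form>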

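```

In the Ground Truth labeling job creation workflow, you can copy and paste the code into the editor to preview the tool, or try out a [demo of this code on CodePen](https://codepen.io/MTGT/full/OqBvJw).

## Input data, external assets and your task template

The following sections describe the use of external assets, input data format requirements, and when to consider using pre-annotation Lambda functions.

### Input data format requirements

When you create an input manifest file for a custom Ground Truth labeling job, you must store the data in Amazon S3. The input manifest file must also be stored in the same AWS Region in which your custom Ground Truth labeling job runs. Beyond that, it can be stored in any Amazon S3 bucket that is accessible to the IAM service role you use to run your custom labeling job in Ground Truth.

Input manifest files must use the newline-delimited JSON, or JSON Lines, format. Each line is delimited by a standard line break (**\n** or **\r\n**). Each line must also be a valid JSON object.

In addition, each JSON object in the manifest file must contain one of the following keys: `source-ref` or `source`. The value of the key is interpreted as follows:
+ `source-ref` – The source of the object is the Amazon S3 object specified in the value. Use this key when the object is a binary object, such as an image.
+ `source` – The source of the object is the value itself. Use this key when the object is a text value.

To learn more about formatting your input manifest files, see [Input manifest files](sms-input-data-input-manifest.md).

### Pre-annotation Lambda function

You can optionally specify a *pre-annotation Lambda* function to manage how data from your input manifest file is handled before labeling. If you have specified the `isHumanAnnotationRequired` key-value pair, you must use a pre-annotation Lambda function. When Ground Truth sends the pre-annotation Lambda function a JSON-formatted request, it uses the following schemas.

**Example data object identified with the `source-ref` key-value pair**

```
{
  "version": "2018-10-16",
  "labelingJobArn": "arn:aws:lambda:us-west-2:555555555555:function:my-function",
  "dataObject" : {
    "source-ref": "s3://input-data-bucket/data-object-file-name"
  }
}
```

**Example data object identified with the `source` key-value pair**

```
{
  "version": "2018-10-16",
  "labelingJobArn" : "arn:aws:lambda:us-west-2:555555555555:function:my-function",
  "dataObject" : {
    "source": "Sue purchased 10 shares of the stock on April 10th, 2020"
  }
}
```

The following is the expected response from the Lambda function when `isHumanAnnotationRequired` is used.

```
{
  "taskInput": {
    "source": "This train is really late.",
    "labels": [ "angry" , "sad" , "happy" , "inconclusive" ],
    "header": "What emotion is the speaker feeling?"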
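  },
  "isHumanAnnotationRequired": false
}
```

### Using external assets

Amazon SageMaker Ground Truth custom templates allow external scripts and style sheets to be embedded. For example, the following code block demonstrates how you would add a style sheet located at `https://www.example.com/my-enhancement-styles.css` to your template.

**Example**

```
<link rel="stylesheet" type="text/css" href="https://www.example.com/my-enhancement-styles.css">
```

If you encounter errors, ensure that your originating server is sending the correct MIME type and encoding headers with the assets.

For example, the MIME and encoding type for remote scripts is `application/javascript;CHARSET=UTF-8`.

The MIME and encoding type for remote style sheets is `text/css;CHARSET=UTF-8`.

## Output data and your task template

The following sections describe the output data from a custom labeling job, and when to consider using a post-annotation Lambda function.

### Output data

When your custom labeling job completes, the data is saved in the Amazon S3 bucket specified when the labeling job was created. The data is saved in an `output.manifest` file.

**Note**
*labelAttributeName* is a placeholder variable. In your output, it is either the name of your labeling job or the label attribute name you specified when you created the labeling job.
+ `source` or `source-ref` – The string or S3 URI that workers were asked to label.
+ `labelAttributeName` – A dictionary containing consolidated label content from the [post-annotation Lambda function](sms-custom-templates-step3-lambda-requirements.md#sms-custom-templates-step3-postlambda). If no post-annotation Lambda function is specified, this dictionary is empty.
+ `labelAttributeName-metadata` – Metadata from your custom labeling job, added by Ground Truth.
+ `worker-response-ref` – The S3 URI of the bucket where the data is saved. If a post-annotation Lambda function is specified, this key-value pair is not present.

In this example the JSON object is formatted for readability; in the actual output file, the JSON object is on a single line.

```
{
  "source" : "This train is really late.",
  "labelAttributeName" : {},
  "labelAttributeName-metadata": { # These key values pairs are added by Ground Truth
    "job_name": "test-labeling-job",
    "type": "groundTruth/custom",
    "human-annotated": "yes",
    "creation_date": "2021-03-08T23:06:49.111000",
    "worker-response-ref": "s3://amzn-s3-demo-bucket/test-labeling-job/annotations/worker-response/iteration-1/0/2021-03-08_23:06:49.json"
  }
}
```

### Using a post-annotation Lambda to consolidate results from your workers

By default, Ground Truth saves worker responses unprocessed in Amazon S3. For more fine-grained control over how responses are handled, you can specify a *post-annotation Lambda function*. For example, a post-annotation Lambda function can be used to consolidate annotations if multiple workers have labeled the same data object. To learn more about creating post-annotation Lambda functions, see [Post-annotation Lambda](sms-custom-templates-step3-lambda-requirements.md#sms-custom-templates-step3-postlambda).

If you want to use a post-annotation Lambda function, it must be specified as part of the [`AnnotationConsolidationConfig`](https://docs.aws.amazon.com//sagemaker/latest/APIReference/API_AnnotationConsolidationConfig.html) in a `CreateLabelingJob` request.

To learn more about how annotation consolidation works, see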
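[Annotation consolidation](sms-annotation-consolidation.md).

# Adding automation with Liquid

Our custom template system uses [Liquid](https://shopify.github.io/liquid/) for automation. It is an open-source inline markup language. In Liquid, the text between single curly braces and percent symbols is an instruction or *tag* that performs an operation like control flow or iteration. The text between double curly braces is a variable or *object* that outputs its value.

The most common use of Liquid is to parse the data coming from your input manifest file and pull out the relevant variables to create the task. Ground Truth automatically generates the tasks unless a pre-annotation Lambda is specified. The `taskInput` object returned by Ground Truth or by your [pre-annotation Lambda](sms-custom-templates-step3-lambda-requirements.md#sms-custom-templates-step3-prelambda) is the `task.input` object in your templates.

The properties in your input manifest are passed into your template as the `event.dataObject`.

**Example manifest data object**

```
{
  "source": "This is a sample text for classification",
  "labels": [ "angry" , "sad" , "happy" , "inconclusive" ],
  "header": "What emotion is the speaker feeling?"
}
```

**Example sample HTML using variables**

```
<crowd-classifier
  name='tweetFeeling'
  categories='{{ task.input.labels | to_json }}'
  header='{{ task.input.header }}' >
<classification-target>
  {{ task.input.source }}
</classification-target>
</crowd-classifier>
```

Note the addition of ` | to_json` to the `labels` property above. That is a filter that turns the input manifest array into a JSON representation of the array. Variable filters are explained in the next section.

The following list includes two types of Liquid tags that you may find useful for automating template input data processing. If you select one of the following tag types, you are redirected to the Liquid documentation.
+ [Control flow](https://shopify.github.io/liquid/tags/control-flow/): Includes programming logic operators like `if/else`, `unless`, and `case/when`.
+ [Iteration](https://shopify.github.io/liquid/tags/iteration/): Enables you to run blocks of code repeatedly using statements like for loops.

  For an example of an HTML template that uses Liquid elements to create a for loop, see [translation-review-and-correction.liquid.html](https://github.com/aws-samples/amazon-sagemaker-ground-truth-task-uis/blob/8ae02533ea5a91087561b1daecd0bc22a37ca393/text/translation-review-and-correction.liquid.html) on GitHub.

For more information and documentation, visit the [Liquid homepage](https://shopify.github.io/liquid/).

## Variable filters

In addition to the standard [Liquid filters](https://shopify.github.io/liquid/filters/abs/) and actions, Ground Truth offers a few additional filters. Filters are applied by placing a pipe (`|`) character after the variable name, then specifying a filter name. Filters can be chained in the following form:

**Example**

```
{{ <content> | <filter> | <filter> }}
```

### Autoescape and explicit escape

By default, inputs are HTML-escaped to prevent confusion between your variable text and HTML. You can explicitly add the `escape` filter to make it more obvious to someone reading the source of your template that escaping is being done.

### escape_once

`escape_once` ensures that if you have already escaped your code, it does not get re-escaped on top of that. For example, `&amp;` does not become `&amp;amp;`.

### skip_autoescape

`skip_autoescape` is useful when your content is meant to be used as HTML. For example, you might have a few paragraphs of text and some images in the full instructions for a bounding box.

**Use `skip_autoescape` sparingly**
The best practice in templates is to avoid using `skip_autoescape`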
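to pass in functional code or markup unless you are absolutely sure you have strict control over what is being passed. If you are passing user input, you could be opening your workers up to a cross-site scripting attack.

### to_json

`to_json` encodes what you feed it to JSON (JavaScript Object Notation). If you feed it an object, it serializes it.

### grant_read_access

`grant_read_access` takes an S3 URI and encodes it into an HTTPS URL with a short-lived access token for that resource. This makes it possible to display to workers photo, audio, or video objects stored in S3 buckets that are not otherwise publicly accessible.

### s3_presign

The `s3_presign` filter works the same way as the `grant_read_access` filter. `s3_presign` takes an Amazon S3 URI and encodes it into an HTTPS URL with a short-lived access token for that resource. This makes it possible to display to workers photo, audio, or video objects stored in S3 buckets that are not otherwise publicly accessible.

**Example of the variable filters**

Input

```
auto-escape: {{ "Have you read 'James & the Giant Peach'?" }}
explicit escape: {{ "Have you read 'James & the Giant Peach'?" | escape }}
explicit escape_once: {{ "Have you read 'James & the Giant Peach'?" | escape_once }}
skip_autoescape: {{ "Have you read 'James & the Giant Peach'?" | skip_autoescape }}
to_json: {{ jsObject | to_json }}
grant_read_access: {{ "s3://amzn-s3-demo-bucket/myphoto.png" | grant_read_access }}
s3_presign: {{ "s3://amzn-s3-demo-bucket/myphoto.png" | s3_presign }}
```

**Example**

Output

```
auto-escape: Have you read &#39;James &amp; the Giant Peach&#39;?
explicit escape: Have you read &#39;James &amp; the Giant Peach&#39;?
explicit escape_once: Have you read &#39;James &amp; the Giant Peach&#39;?
skip_autoescape: Have you read 'James & the Giant Peach'?
to_json: { "point_number": 8, "coords": [ 59, 76 ] }
grant_read_access: https://s3.amazonaws.com/amzn-s3-demo-bucket/myphoto.png?<access token and other params>
s3_presign: https://s3.amazonaws.com/amzn-s3-demo-bucket/myphoto.png?<access token and other params>
```

**Example of an automated classification template.**

To automate the simple text classification sample, replace the tweet text with a variable.

The text classification template is shown below with automation added.

```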

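<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
<crowd-form>
  <!-- "tweetFeeling" is an illustrative output variable name -->
  <crowd-classifier
    name="tweetFeeling"
    categories="{{ task.input.labels | to_json }}"
    header="{{ task.input.header }}"
  >
    <classification-target>
      {{ task.input.source }}
    </classification-target>

    <full-instructions>
      Try to determine the feeling the author of the tweet is trying to express. If none seem to match, choose "other."
    </full-instructions>

    <short-instructions>
      Pick the term best describing the sentiment of the tweet.
    </short-instructions>
  </crowd-classifier>
</crowd-form>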

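```

The tweet text in the prior sample is now replaced with an object. The `taskInput` object (referenced in the template as `task.input`) uses `source` (or another name you specify in your pre-annotation Lambda) as the property name for the text, and it is inserted directly in the HTML by virtue of being between double curly braces.

# Processing data in a custom labeling workflow with AWS Lambda

In this topic you can learn how to deploy optional [AWS Lambda](https://aws.amazon.com/lambda/) functions when creating a custom labeling workflow. You can specify two types of Lambda functions to use with your custom labeling workflow.
+ *Pre-annotation Lambda*: This function pre-processes each data object before it is sent to your workers.
+ *Post-annotation Lambda*: This function processes the results after workers submit a task. If you specify multiple workers per data object, this function may include logic to consolidate annotations.

If you are a new user of Lambda and Ground Truth, we recommend that you use the pages in this section as follows:

1. First, review [Using pre-annotation and post-annotation Lambda functions](sms-custom-templates-step3-lambda-requirements.md).
1. Then, use the page [Add required permissions to use AWS Lambda with Ground Truth](sms-custom-templates-step3-lambda-permissions.md) to learn about the security and permission requirements for using your pre-annotation and post-annotation Lambda functions in a Ground Truth custom labeling job.
1. Next, visit the Lambda console or use the Lambda API to create your functions. Use the section [Create Lambda functions using Ground Truth templates](sms-custom-templates-step3-lambda-create.md) to learn how to create Lambda functions.
1. To learn how to test your Lambda functions, see [Test pre-annotation and post-annotation Lambda functions](sms-custom-templates-step3-lambda-test.md).
1. After you create the pre-processing and post-processing Lambda functions, select them from the **Lambda functions** section that comes after the code editor for your custom HTML in the Ground Truth console. To learn how to use these functions in a `CreateLabelingJob` API request, see [Create a labeling job (API)](sms-create-labeling-job-api.md).

For a custom labeling workflow tutorial that includes example pre-annotation and post-annotation Lambda functions, see [Demo template: Annotation of images with `crowd-bounding-box`](sms-custom-templates-step2-demo1.md).

**Topics**
+ [Using pre-annotation and post-annotation Lambda functions](sms-custom-templates-step3-lambda-requirements.md)
+ [Add required permissions to use AWS Lambda with Ground Truth](sms-custom-templates-step3-lambda-permissions.md)
+ [Create Lambda functions using Ground Truth templates](sms-custom-templates-step3-lambda-create.md)
+ [Test pre-annotation and post-annotation Lambda functions](sms-custom-templates-step3-lambda-test.md)

# Using pre-annotation and post-annotation Lambda functions

Use these topics to learn about the syntax of the requests sent to pre-annotation and post-annotation Lambda functions, and the response syntax that Ground Truth requires in custom labeling workflows.

**Topics**
+ [Pre-annotation Lambda](#sms-custom-templates-step3-prelambda)
+ [Post-annotation Lambda](#sms-custom-templates-step3-postlambda)

## Pre-annotation Lambda

Before a labeling task is sent to the worker, an optional pre-annotation Lambda function can be invoked.

Ground Truth sends your Lambda function a JSON-formatted request to provide details about the labeling job and the data object.

The following are 2 example JSON-formatted requests.

------
#### [ Data object identified with "source-ref" ]

```
{
    "version": "2018-10-16",
    "labelingJobArn": <labelingJobArn>,
    "dataObject" : {
        "source-ref": <s3Uri>
    }
}
```

------
#### [ Data object identified with "source" ]

```
{
    "version": "2018-10-16",
    "labelingJobArn": <labelingJobArn>,
    "dataObject" : {
        "source": <string>
    }
}
```

------

The following list describes the pre-annotation request schema.
+ `version` (string): This is a version number used internally by Ground Truth.
+ `labelingJobArn` (string): This is the Amazon Resource Name, or ARN, of your labeling job. This ARN can be used to reference the labeling job when using Ground Truth API operations such as `DescribeLabelingJob`.
+ `dataObject` (JSON object): The key contains a single JSON line, either from your input manifest file or sent from Amazon SNS. The JSON line objects in your manifest can be up to 100 KB in size and can contain a variety of data. For a very basic image annotation job, the `dataObject` JSON may just contain a `source-ref` key, identifying the image to be annotated. If the data object (for example, a line of text) is included directly in the input manifest file, the data object is identified with `source`. If you create a verification or adjustment job, this line may contain label data and metadata from the previous labeling job.

The following tabbed examples show pre-annotation requests. The parameters in these example requests are the ones described in the preceding list.

------
#### [ Data object identified with "source-ref" ]

```
{
    "version": "2018-10-16",
    "labelingJobArn": "arn:aws:sagemaker:us-west-2:111122223333:labeling-job/",
    "dataObject" : {
        "source-ref": "s3://input-data-bucket/data-object-file-name"
    }
}
```

------
#### [ Data object identified with "source" ]

```
{
    "version": "2018-10-16",
    "labelingJobArn": "arn:aws:sagemaker::111122223333:labeling-job/",
    "dataObject" : {
        "source": "Sue purchased 10 shares of the stock on April 10th, 2020"
    }
}
```

------

In return, Ground Truth requires a response formatted like the following:

**Example of expected return data**

```
{
    "taskInput": <json object>,
    "isHumanAnnotationRequired": <boolean> # Optional
}
```

In the previous example, the `<json object>` needs to contain *all* the data your custom worker task template needs. If you are doing a bounding box task where the instructions stay the same all the time, it may just be the HTTP(S) or Amazon S3 resource for your image file. If it is a sentiment analysis task and different objects may have different choices, it is the object reference as a string and the choices as an array of strings.

**Implications of `isHumanAnnotationRequired`**
This value is optional because it defaults to `true`. The primary use case for explicitly setting it is when you want to exclude this data object from being labeled by human workers.

If you have a mix of objects in your manifest, with some requiring human annotation and some not needing it, you can include an `isHumanAnnotationRequired` value in each data object. You can add logic to your pre-annotation Lambda to dynamically determine if an object requires annotation, and set this boolean value accordingly.

### Examples of pre-annotation Lambda functions

The following basic pre-annotation Lambda function accesses the JSON object in `dataObject` from the initial request, and returns it in the `taskInput` parameter.

```
import json

def lambda_handler(event, context):
    return {
        "taskInput": event['dataObject']
    }
```

Assuming the input manifest file uses `"source-ref"` to identify data objects, the worker task template used in the same labeling job as this pre-annotation Lambda must include a Liquid element like the following to ingest `dataObject`:

```
{{ task.input.source-ref | grant_read_access }}
```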
If the input manifest file uses `source` to identify the data object, the worker task template can ingest `dataObject` with the following:

```
{{ task.input.source }}
```

The following pre-annotation Lambda example includes logic to identify the key used in `dataObject`, and to point to that data object using `taskObject` in the Lambda's return statement.

```
import json

def lambda_handler(event, context):

    # Event received
    print("Received event: " + json.dumps(event, indent=2))

    # Get source if specified
    source = event['dataObject']['source'] if "source" in event['dataObject'] else None

    # Get source-ref if specified
    source_ref = event['dataObject']['source-ref'] if "source-ref" in event['dataObject'] else None

    # If the source field is present, take that; otherwise take source-ref
    task_object = source if source is not None else source_ref

    # Build response object (the key name must be isHumanAnnotationRequired)
    output = {
        "taskInput": {
            "taskObject": task_object
        },
        "isHumanAnnotationRequired": "true"
    }

    print(output)
    # If neither source nor source-ref is specified, mark the annotation failed
    if task_object is None:
        print(" Failed to pre-process {} !".format(event["labelingJobArn"]))
        output["isHumanAnnotationRequired"] = "false"

    return output
```

## Post-annotation Lambda

When all workers have annotated the data object, or when [`TaskAvailabilityLifetimeInSeconds`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanLoopConfig.html#SageMaker-Type-HumanLoopConfig-TaskAvailabilityLifetimeInSeconds) has been reached, whichever comes first, Ground Truth sends those annotations to your post-annotation Lambda. This Lambda is generally used for [Annotation consolidation](sms-annotation-consolidation.md).

**Note**
To see an example of a post-consolidation Lambda function, see [annotation_consolidation_lambda.py](https://github.com/aws-samples/aws-sagemaker-ground-truth-recipe/blob/master/aws_sagemaker_ground_truth_sample_lambda/annotation_consolidation_lambda.py) in the [aws-sagemaker-ground-truth-recipe](https://github.com/aws-samples/aws-sagemaker-ground-truth-recipe) GitHub repository.

The following code block contains the post-annotation request schema. Each parameter is described in the following bulleted list.

```
{
    "version": "2018-10-16",
    "labelingJobArn": <string>,
    "labelCategories": [<string>],
    "labelAttributeName": <string>,
    "roleArn" : <string>,
    "payload": {
        "s3Uri": <string>
    }
}
```
+ `version`
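(string): A version number used internally by Ground Truth.
+ `labelingJobArn` (string): The Amazon Resource Name, or ARN, of your labeling job. This ARN can be used to reference the labeling job when using Ground Truth API operations such as `DescribeLabelingJob`.
+ `labelCategories` (list of strings): Includes the label categories and other attributes you either specified in the console, or that you included in the label category configuration file.
+ `labelAttributeName` (string): Either the name of your labeling job, or the label attribute name you specified when you created the labeling job.
+ `roleArn` (string): The Amazon Resource Name (ARN) of the IAM execution role you specified when you created the labeling job.
+ `payload` (JSON object): A JSON object that includes an `s3Uri` key, which identifies the location of the annotation data for that data object in Amazon S3. The second code block below shows an example of this annotation file.

The following code block contains an example of a post-annotation request.

**Example of a post-annotation Lambda request**

```
{
    "version": "2018-10-16",
    "labelingJobArn": "arn:aws:sagemaker:us-west-2:111122223333:labeling-job/labeling-job-name",
    "labelCategories": ["Ex Category1","Ex Category2", "Ex Category3"],
    "labelAttributeName": "labeling-job-attribute-name",
    "roleArn" : "arn:aws:iam::111122223333:role/role-name",
    "payload": {
        "s3Uri": "s3://amzn-s3-demo-bucket/annotations.json"
    }
}
```

**Note**
If no worker works on the data object and `TaskAvailabilityLifetimeInSeconds` has been reached, the data object is marked as failed and is not included in the post-annotation Lambda invocation.

The following code block contains the payload schema. This is the file indicated by the `s3Uri` parameter in the `payload` JSON object of the post-annotation Lambda request. For example, if the previous code block is the post-annotation Lambda request, the following annotation file is located at `s3://amzn-s3-demo-bucket/annotations.json`.

Each parameter is described in the following bulleted list.

**Example of an annotation file**

```
[
    {
        "datasetObjectId": <string>,
        "dataObject": {
            "s3Uri": <string>,
            "content": <string>
        },
        "annotations": [{
            "workerId": <string>,
            "annotationData": {
                "content": <string>,
                "s3Uri": <string>
            }
        }]
    }
]
```
+ `datasetObjectId` (string): Identifies a unique ID that Ground Truth assigns to each data object you send to the labeling job.
+ `dataObject` (JSON object): The data object that was labeled. If the data object is included in the input manifest file and identified using the `source` key (for example, a string), `dataObject` includes a `content` key, which identifies the data object. Otherwise, the location of the data object (for example, a link or S3 URI) is identified with `s3Uri`.
+ `annotations` (list of JSON objects): This list contains a single JSON object for each annotation submitted by workers for that `dataObject`. A single JSON object contains a unique `workerId` that can be used to identify the worker that submitted that annotation. The `annotationData` key contains one of the following:
  + `content` (string): Contains the annotation data.
  + `s3Uri` (string): Contains an S3 URI that identifies the location of the annotation data.

The following tabs contain examples of the content that you may find in the payload for different types of annotation.

------
#### [ Named Entity Recognition Payload ]

```
[
    {
      "datasetObjectId": "1",
      "dataObject": {
        "content": "Sift 3 cups of flour into the bowl."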
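      },
      "annotations": [
        {
          "workerId": "private.us-west-2.ef7294f850a3d9d1",
          "annotationData": {
            "content": "{\"crowd-entity-annotation\":{\"entities\":[{\"endOffset\":4,\"label\":\"verb\",\"startOffset\":0},{\"endOffset\":6,\"label\":\"number\",\"startOffset\":5},{\"endOffset\":20,\"label\":\"object\",\"startOffset\":15},{\"endOffset\":34,\"label\":\"object\",\"startOffset\":30}]}}"
          }
        }
      ]
    }
]
```

------
#### [ Semantic Segmentation Payload ]

```
[
    {
      "datasetObjectId": "2",
      "dataObject": {
        "s3Uri": "s3://amzn-s3-demo-bucket/gt-input-data/images/bird3.jpg"
      },
      "annotations": [
        {
          "workerId": "private.us-west-2.ab1234c5678a919d0",
          "annotationData": {
            "content": "{\"crowd-semantic-segmentation\":{\"inputImageProperties\":{\"height\":2000,\"width\":3020},\"labelMappings\":{\"Bird\":{\"color\":\"#2ca02c\"}},\"labeledImage\":{\"pngImageData\":\"iVBOR...\"}}}"
          }
        }
      ]
    }
]
```

------
#### [ Bounding Box Payload ]

```
[
    {
      "datasetObjectId": "0",
      "dataObject": {
        "s3Uri": "s3://amzn-s3-demo-bucket/gt-input-data/images/bird1.jpg"
      },
      "annotations": [
        {
          "workerId": "private.us-west-2.ab1234c5678a919d0",
          "annotationData": {
            "content": "{\"boundingBox\":{\"boundingBoxes\":[{\"height\":2052,\"label\":\"Bird\",\"left\":583,\"top\":302,\"width\":1375}],\"inputImageProperties\":{\"height\":2497,\"width\":3745}}}"
          }
        }
      ]
    }
]
```

------

Your post-annotation Lambda function may contain logic similar to the following to loop through and access all of the annotations contained in the request. For a full example, see [annotation_consolidation_lambda.py](https://github.com/aws-samples/aws-sagemaker-ground-truth-recipe/blob/master/aws_sagemaker_ground_truth_sample_lambda/annotation_consolidation_lambda.py) in the [aws-sagemaker-ground-truth-recipe](https://github.com/aws-samples/aws-sagemaker-ground-truth-recipe) GitHub repository. In this GitHub example, you must add your own annotation consolidation logic.

```
# Excerpt: annotations, s3_client, log_prefix, and the json import are
# defined elsewhere in the GitHub sample that this loop is taken from.
for i in range(len(annotations)):
    worker_id = annotations[i]["workerId"]
    annotation_content = annotations[i]['annotationData'].get('content')
    # The key is spelled "s3Uri" in the annotation file schema above.
    annotation_s3_uri = annotations[i]['annotationData'].get('s3Uri')
    annotation = annotation_content if annotation_s3_uri is None else s3_client.get_object_from_s3(
        annotation_s3_uri)
    annotation_from_single_worker = json.loads(annotation)

    print("{} Received Annotations from worker [{}] is [{}]"
          .format(log_prefix, worker_id, annotation_from_single_worker))
```

**Tip**
When you run consolidation algorithms on the data, you can use an AWS database service to store results, or you can pass the processed results back to Ground Truth. The data you return to Ground Truth is stored in consolidated annotation manifests in the S3 bucket specified for output during the configuration of the labeling job.

In return, Ground Truth requires a response formatted like the following:

**Example of expected return data**

```
[
    {
        "datasetObjectId": <string>,
        "consolidatedAnnotation": {
            "content": {
                "<labelattributename>": {
                    # ... label content
                }
            }
        }
    },
    {
        "datasetObjectId": <string>,
        "consolidatedAnnotation": {
            "content": {
                "<labelattributename>": {
                    # ... label content
                }
            }
        }
    }
    .
    .
    .
]
```

At this point, all the data you are sending to your S3 bucket, other than the `datasetObjectId`, is in the `content` object.

When you return annotations in `content`, this results in an entry in your job's output manifest like the following:

**Example of label format in output manifest**

```
{
    "source-ref"/"source" : "<s3uri or content>",
    "<labelAttributeName>": {
        # ... label content from you
    },
    "<labelAttributeName>-metadata": { # This will be added by Ground Truth
        "job_name": <labelingJobName>,
        "type": "groundTruth/custom",
        "human-annotated": "yes",
        "creation_date": <date> # Timestamp of when received from Post-labeling Lambda
    }
}
```

Because of the potentially complex nature of a custom template and the data it collects, Ground Truth does not offer further processing of the data.

# Add required permissions to use AWS Lambda with Ground Truth

You may need to configure some or all of the following to create and use AWS Lambda with Ground Truth.
+ You need to grant an IAM role or user (collectively, an IAM entity) permission to create the pre-annotation and post-annotation Lambda functions using AWS Lambda, and to choose them when creating the labeling job.
+ The IAM execution role specified when the labeling job is configured needs permission to invoke the pre-annotation and post-annotation Lambda functions.
+ The post-annotation Lambda functions may need permission to access Amazon S3.

Use the following sections to learn how to create the IAM entities and grant the permissions described above.

**Topics**
+ [Grant permission to create and select an AWS Lambda function](#sms-custom-templates-step3-postlambda-create-perms)
+ [Grant IAM execution role permission to invoke AWS Lambda functions](#sms-custom-templates-step3-postlambda-execution-role-perms)
+ [Grant post-annotation Lambda permissions to access annotation](#sms-custom-templates-step3-postlambda-perms)

## Grant permission to create and select an AWS Lambda function

If you do not require granular permissions to develop pre-annotation and post-annotation Lambda functions, you can attach the AWS managed policy `AWSLambda_FullAccess` to a user or role. This policy grants broad permissions to use all Lambda features, as well as permission to perform actions in other AWS services with which Lambda interacts.

To create a more granular policy for security-sensitive use cases, refer to [Identity-based IAM policies for Lambda](https://docs.aws.amazon.com/lambda/latest/dg/access-control-identity-based.html) in the AWS Lambda Developer Guide to learn how to create an IAM policy that fits your use case.

**Policies to use the Lambda console**

If you want to grant an IAM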
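entity permission to use the Lambda console, see [Using the Lambda console](https://docs.aws.amazon.com/lambda/latest/dg/security_iam_id-based-policy-examples.html#security_iam_id-based-policy-examples-console) in the AWS Lambda Developer Guide.

Additionally, if you want the user to be able to access and deploy the Ground Truth starter pre-annotation and post-annotation functions using the AWS Serverless Application Repository in the Lambda console, you must specify the AWS Region where you want to deploy the functions (this should be the same AWS Region used to create the labeling job), and add the following policy to the IAM role.

------
#### [ JSON ]

```
{
    "Version":"2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "serverlessrepo:ListApplicationVersions",
                "serverlessrepo:GetApplication",
                "serverlessrepo:CreateCloudFormationTemplate"
            ],
            "Resource": "arn:aws:serverlessrepo:us-east-1:838997950401:applications/aws-sagemaker-ground-truth-recipe"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": "serverlessrepo:SearchApplications",
            "Resource": "*"
        }
    ]
}
```

------

**Policies to see Lambda functions in the Ground Truth console**

To grant an IAM entity permission to view Lambda functions in the Ground Truth console when the user is creating a custom labeling job, the entity must have the permissions described in [Grant IAM permission to use the Amazon SageMaker Ground Truth console](sms-security-permission-console-access.md), including the permissions described in the section [Custom labeling workflow permissions](sms-security-permission-console-access.md#sms-security-permissions-custom-workflow).

## Grant IAM execution role permission to invoke AWS Lambda functions

If you add the IAM managed policy [AmazonSageMakerGroundTruthExecution](https://console.aws.amazon.com/iam/home?#/policies/arn:aws:iam::aws:policy/AmazonSageMakerGroundTruthExecution) to the IAM execution role used to create the labeling job, this role has permission to list and invoke Lambda functions whose names contain one of the following strings: `GtRecipe`, `SageMaker`, `Sagemaker`, `sagemaker`, or `LabelingFunction`.

If the pre-annotation or post-annotation Lambda function names do not include one of the terms in the preceding paragraph, or if you require more granular permissions than those in the `AmazonSageMakerGroundTruthExecution` managed policy, you can add a policy similar to the following to give the execution role permission to invoke the pre-annotation and post-annotation functions.

------
#### [ JSON ]

```
{
    "Version":"2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            "Resource": [
                "arn:aws:lambda:us-east-1:111122223333:function:",
                "arn:aws:lambda:us-east-1:111122223333:function:"
            ]
        }
    ]
}
```

------

## Grant post-annotation Lambda permissions to access annotation

As described in [Post-annotation Lambda](sms-custom-templates-step3-lambda-requirements.md#sms-custom-templates-step3-postlambda), the post-annotation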
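Lambda request includes the location of the annotation data in Amazon S3. This location is identified by the `s3Uri` string in the `payload` object. To process the annotations as they come in, even for a simple pass-through function, you need to assign the necessary permissions to the post-annotation [Lambda execution role](https://docs.aws.amazon.com/lambda/latest/dg/lambda-intro-execution-role.html) to read files from Amazon S3.

There are many ways to configure your Lambda function to access annotation data in Amazon S3. Two common ways are:
+ Allow the Lambda execution role to assume the SageMaker AI execution role identified in `roleArn` in the post-annotation Lambda request. This SageMaker AI execution role is the one used to create the labeling job, and it has access to the Amazon S3 output bucket where the annotation data is stored.
+ Grant the Lambda execution role permission to access the Amazon S3 output bucket directly.

Use the following sections to learn how to configure these options.

**Grant Lambda permission to assume SageMaker AI execution role**

To allow a Lambda function to assume a SageMaker AI execution role, you must attach a policy to the Lambda function's execution role, and modify the trust relationship of the SageMaker AI execution role to allow Lambda to assume it.

1. [Attach the following IAM policy](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html) to your Lambda function's execution role to assume the SageMaker AI execution role identified in `Resource`. Replace `222222222222` with an [AWS account ID](https://docs.aws.amazon.com/general/latest/gr/acct-identifiers.html). Replace `sm-execution-role` with the name of the assumed role.

------
#### [ JSON ]

   ```
   {
       "Version":"2012-10-17",
       "Statement": {
           "Effect": "Allow",
           "Action": "sts:AssumeRole",
           "Resource": "arn:aws:iam::222222222222:role/sm-execution-role"
       }
   }
   ```

------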
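1. [Modify the trust policy](https://docs.aws.amazon.com/IAM/latest/UserGuide/roles-managingrole-editing-console.html#roles-managingrole_edit-trust-policy) of the SageMaker AI execution role to include the following `Statement`. Replace `222222222222` with an [AWS account ID](https://docs.aws.amazon.com/general/latest/gr/acct-identifiers.html). Replace `my-lambda-execution-role` with the name of your Lambda function's execution role, which is the role that assumes the SageMaker AI execution role.

------
#### [ JSON ]

   ```
   {
       "Version":"2012-10-17",
       "Statement": [
           {
               "Effect": "Allow",
               "Principal": {
                   "AWS": "arn:aws:iam::222222222222:role/my-lambda-execution-role"
               },
               "Action": "sts:AssumeRole"
           }
       ]
   }
   ```

------

**Grant Lambda execution role permission to access S3**

You can add a policy similar to the following to the post-annotation Lambda function's execution role to give it S3 read permissions. Replace *amzn-s3-demo-bucket* with the name of the output bucket you specified when you created the labeling job.

------
#### [ JSON ]

```
{
    "Version":"2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*"
        }
    ]
}
```

------

To add S3 read permissions to a Lambda execution role in the Lambda console, use the following procedure.

**Add S3 read permissions to post-annotation Lambda:**

1. Open the [**Functions** page](https://console.aws.amazon.com/lambda/home#/functions) in the Lambda console.
1. Choose the name of the post-annotation function.
1. Choose **Configuration** and then choose **Permissions**.
1. Select the **Role name**. The summary page for that role opens in the IAM console in a new tab.
1. Select **Attach policies**.
1. Do one of the following:
   + Search for and select **`AmazonS3ReadOnlyAccess`** to give the function permission to read all buckets and objects in the account.
   + If you require more granular permissions, select **Create policy** and use the policy example in the preceding section to create a policy. Note that you must navigate back to the execution role summary page after you create the policy.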
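1. If you used the `AmazonS3ReadOnlyAccess` managed policy, select **Attach policy**.

   If you created a new policy, navigate back to the Lambda execution role summary page and attach the policy you just created.

# Create Lambda functions using Ground Truth templates

You can create a Lambda function using the Lambda console, the AWS CLI, or an AWS SDK in a supported programming language of your choice. Use the AWS Lambda Developer Guide to learn more about each of these options:
+ To learn how to create a Lambda function using the console, see [Create a Lambda function with the console](https://docs.aws.amazon.com/lambda/latest/dg/getting-started-create-function.html).
+ To learn how to create a Lambda function using the AWS CLI, see [Using AWS Lambda with the AWS Command Line Interface](https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-awscli.html).
+ Select the relevant section in the table of contents to learn more about working with Lambda in the language of your choice. For example, select [Working with Python](https://docs.aws.amazon.com/lambda/latest/dg/lambda-python.html) to learn more about using Lambda with the AWS SDK for Python (Boto3).

Ground Truth provides pre-annotation and post-annotation templates through an AWS Serverless Application Repository (SAR) *recipe*. Use the following procedure to select the Ground Truth recipe in the Lambda console.

**Use the Ground Truth SAR recipe to create pre-annotation and post-annotation Lambda functions:**

1. Open the [**Functions** page](https://console.aws.amazon.com/lambda/home#/functions) in the Lambda console.
1. Select **Create function**.
1. Select **Browse serverless app repository**.
1. In the search text box, enter **aws-sagemaker-ground-truth-recipe** and select that app.
1. Select **Deploy**. The app may take a couple of minutes to deploy.

   After the app deploys, two functions appear in the **Functions** section of the Lambda console: `serverlessrepo-aws-sagema-GtRecipePreHumanTaskFunc-` and `serverlessrepo-aws-sagema-GtRecipeAnnotationConsol-`.
1. Select one of these functions, and add your custom logic in the **Code** section.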
1. When you are finished making changes, select **Deploy** to deploy them.

# Test pre-annotation and post-annotation Lambda functions

You can test your pre-annotation and post-annotation Lambda functions in the Lambda console. If you are a new user of Lambda, you can learn how to test, or *invoke*, your Lambda functions in the console using the [Create a Lambda function with the console](https://docs.aws.amazon.com/lambda/latest/dg/getting-started-create-function.html#gettingstarted-zip-function) tutorial in the AWS Lambda Developer Guide.

You can use the sections on this page to learn how to test the Ground Truth pre-annotation and post-annotation templates provided through an AWS Serverless Application Repository (SAR).

**Topics**
+ [Prerequisites](#sms-custom-templates-step3-lambda-test-pre)
+ [Test the pre-annotation Lambda function](#sms-custom-templates-step3-lambda-test-pre-annotation)
+ [Test the post-annotation Lambda function](#sms-custom-templates-step3-lambda-test-post-annotation)

## Prerequisites

You must do the following to use the tests described on this page.
+ You need access to the Lambda console, and you need permission to create and invoke Lambda functions. To learn how to set up these permissions, see [Grant permission to create and select an AWS Lambda function](sms-custom-templates-step3-lambda-permissions.md#sms-custom-templates-step3-postlambda-create-perms).
+ If you have not deployed the Ground Truth SAR recipe, use the procedure in [Create Lambda functions using Ground Truth templates](sms-custom-templates-step3-lambda-create.md) to do so.
+ To test the post-annotation Lambda function, you must have a data file in Amazon S3 with sample annotation data. For a simple test, you can copy and paste the following code into a file, save it as `sample-annotations.json`, and [upload it to Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/upload-objects.html). Note the S3 URI of this file; you need this information to configure the post-annotation Lambda test.

```
[{"datasetObjectId":"0","dataObject":{"content":"To train a machine learning model, you need a large, high-quality, labeled dataset. Ground Truth helps you build high-quality training datasets for your machine learning models."},"annotations":[{"workerId":"private.us-west-2.0123456789","annotationData":{"content":"{\"crowd-entity-annotation\":{\"entities\":[{\"endOffset\":8,\"label\":\"verb\",\"startOffset\":3},{\"endOffset\":27,\"label\":\"adjective\",\"startOffset\":11},{\"endOffset\":33,\"label\":\"object\",\"startOffset\":28},{\"endOffset\":51,\"label\":\"adjective\",\"startOffset\":46},{\"endOffset\":65,\"label\":\"adjective\",\"startOffset\":53},{\"endOffset\":74,\"label\":\"adjective\",\"startOffset\":67},{\"endOffset\":82,\"label\":\"adjective\",\"startOffset\":75},{\"endOffset\":102,\"label\":\"verb\",\"startOffset\":97},{\"endOffset\":112,\"label\":\"verb\",\"startOffset\":107},{\"endOffset\":125,\"label\":\"adjective\",\"startOffset\":113},{\"endOffset\":134,\"label\":\"adjective\",\"startOffset\":126},{\"endOffset\":143,\"label\":\"object\",\"startOffset\":135},{\"endOffset\":169,\"label\":\"adjective\",\"startOffset\":153},{\"endOffset\":176,\"label\":\"object\",\"startOffset\":170}]}}"}}]},{"datasetObjectId":"1","dataObject":{"content":"Sift 3 cups of flour into the bowl."},"annotations":[{"workerId":"private.us-west-2.0123456789","annotationData":{"content":"{\"crowd-entity-annotation\":{\"entities\":[{\"endOffset\":4,\"label\":\"verb\",\"startOffset\":0},{\"endOffset\":6,\"label\":\"number\",\"startOffset\":5},{\"endOffset\":20,\"label\":\"object\",\"startOffset\":15},{\"endOffset\":34,\"label\":\"object\",\"startOffset\":30}]}}"}}]},{"datasetObjectId":"2","dataObject":{"content":"Jen purchased 10 shares of the stock on Janurary 1st, 2020."},"annotations":[{"workerId":"private.us-west-2.0123456789","annotationData":{"content":"{\"crowd-entity-annotation\":{\"entities\":[{\"endOffset\":3,\"label\":\"person\",\"startOffset\":0},{\"endOffset\":13,\"label\":\"verb\",\"startOffset\":4},{\"endOffset\":16,\"label\":\"number\",\"startOffset\":14},{\"endOffset\":58,\"label\":\"date\",\"startOffset\":40}]}}"}}]},{"datasetObjectId":"3","dataObject":{"content":"The narrative was interesting, however the character development was weak."},"annotations":[{"workerId":"private.us-west-2.0123456789","annotationData":{"content":"{\"crowd-entity-annotation\":{\"entities\":[{\"endOffset\":29,\"label\":\"adjective\",\"startOffset\":18},{\"endOffset\":73,\"label\":\"adjective\",\"startOffset\":69}]}}"}}]}]
```
+ You must use the directions in [Grant post-annotation Lambda permissions to access annotation](sms-custom-templates-step3-lambda-permissions.md#sms-custom-templates-step3-postlambda-perms) to give your post-annotation Lambda function's execution role permission to assume the SageMaker AI execution role you use to create the labeling job. The post-annotation Lambda function uses the SageMaker AI execution role to access the annotation data file, `sample-annotations.json`, in S3.

## Test the pre-annotation Lambda function

Use the following procedure to test the pre-annotation Lambda function created when you deployed the Ground Truth AWS Serverless Application Repository (SAR) recipe.

**Test the Ground Truth SAR recipe pre-annotation Lambda function**

1. Open the [**Functions** page](https://console.aws.amazon.com/lambda/home#/functions) in the Lambda console.
1. Select the pre-annotation function that was deployed from the Ground Truth SAR recipe. The name of this function is similar to `serverlessrepo-aws-sagema-GtRecipePreHumanTaskFunc-`.
1. In the **Code source** section, select the arrow next to **Test**.
1. Select **Configure test event**.
1. Keep the **Create new test event** option selected.
1. Under **Event template**, select **SageMaker Ground Truth PreHumanTask**.
1. Give your test an **Event name**.
1. Select **Create**.
1. Select the arrow next to **Test** again. The test you created should be selected, which is indicated by a dot next to the event name. If it is not selected, select it.
選取 **Test** (測試)，即可執行測試。在執行測試之後，您可看到 **Execution results** (執行結果)。您應會在 **Function logs** (函式日誌) 看到類似下列回應： ``` START RequestId: cd117d38-8365-4e1a-bffb-0dcd631a878f Version: $LATEST Received event: { "version": "2018-10-16", "labelingJobArn": "arn:aws:sagemaker:us-east-2:123456789012:labeling-job/example-job", "dataObject": { "source-ref": "s3://sagemakerexample/object_to_annotate.jpg" } } {'taskInput': {'taskObject': 's3://sagemakerexample/object_to_annotate.jpg'}, 'isHumanAnnotationRequired': 'true'} END RequestId: cd117d38-8365-4e1a-bffb-0dcd631a878f REPORT RequestId: cd117d38-8365-4e1a-bffb-0dcd631a878f Duration: 0.42 ms Billed Duration: 1 ms Memory Size: 128 MB Max Memory Used: 43 MB ``` 我們可在此回應看到 Lambda 函式的輸出符合所需的註釋前回應語法： ``` {'taskInput': {'taskObject': 's3://sagemakerexample/object_to_annotate.jpg'}, 'isHumanAnnotationRequired': 'true'} ``` ## 測試註釋後 Lambda 函式使用下列程序來測試部署 Ground Truth AWS Serverless Application Repository (SAR) 配方時建立的註釋後 Lambda 函數。 **測試 Ground Truth SAR 配方註釋後 Lambda** 1. 開啟 Lambda 主控台的 [**Functions** (函式) 頁面](https://console.aws.amazon.com/lambda/home#/functions)。 1. 從 Ground Truth SAR 配方選取部署的註釋後函式。此函式名稱類似 `serverlessrepo-aws-sagema-GtRecipeAnnotationConsol-`。 1. 在 **Code source** (程式碼來源) 區段，選取 **Test** (測試) 旁邊的箭頭。 1. 選取 **Configure test event** (設定測試事件)。 1. 保持選取 **Create new test event** (建立新測試事件) 選項。 1. 在 **Event template** (事件範本)，選取 **SageMaker Ground Truth AnnotationConsolidation**。 1. 為測試指定 **Event name** (事件名稱)。 1. 修改提供的範本程式碼，如下所示： + 將 `roleArn` 的 Amazon Resource Name (ARN) 取代為用於建立標籤工作的 SageMaker AI 執行角色 ARN。 + 將 `s3Uri` 的 S3 URI 取代為您新增至 Amazon S3 `sample-annotations.json` 檔案的 URI。在進行這些修改之後，測試會看起來類似以下內容： ``` { "version": "2018-10-16", "labelingJobArn": "arn:aws:sagemaker:us-east-2:123456789012:labeling-job/example-job", "labelAttributeName": "example-attribute", "roleArn": "arn:aws:iam::222222222222:role/sm-execution-role", "payload": { "s3Uri": "s3://your-bucket/sample-annotations.json" } } ``` 1. 選取**建立**。 1. 
再次選取 **Test** (測試) 旁邊的箭頭，您應會看到已選取您建立的測試 (由事件名稱以點表示)。如未選取，請選取。 1. 選擇 **Test** (測試)，即可執行測試。在執行測試之後，您應會在 **Function Logs** (函式日誌) 看見 `-- Consolidated Output --` 區段，其中包含 `sample-annotations.json` 的所有註釋清單。 # 示範範本：使用 `crowd-bounding-box` 註釋影像當您在 Amazon SageMaker Ground Truth 主控台選擇使用自訂範本作為任務類型時，您會進入 **Custom labeling task panel** (自訂標籤任務面板)。那裡有多個基礎範本供您選擇。範本代表一些最常見的任務，並提供範例讓您開始建立自訂標籤任務的範本。如果您不使用主控台，或只是要當作額外的資源，請參閱 [Amazon SageMaker AI Ground Truth 範例任務使用者介面](https://github.com/aws-samples/amazon-sagemaker-ground-truth-task-uis)，以取得各種標籤任務類型的示範範本儲存庫。此示範搭配 **BoundingBox** 範本。此示範也可使用任務前後處理資料所需的 AWS Lambda 函數。在上面的 Github 儲存庫中，若要尋找使用 AWS Lambda 函數的範本，請在範本`{{ task.input. }}`中尋找。 **Topics** + [入門邊界框自訂範本](#sms-custom-templates-step2-demo1-base-template) + [您自己的邊界框自訂範本](#sms-custom-templates-step2-demo1-your-own-template) + [您的資訊清單檔案](#sms-custom-templates-step2-demo1-manifest) + [您的註釋前 Lambda 函式](#sms-custom-templates-step2-demo1-pre-annotation) + [您的註釋後 Lambda 函式](#sms-custom-templates-step2-demo1-post-annotation) + [標籤工作的輸出](#sms-custom-templates-step2-demo1-job-output) ## 入門邊界框自訂範本這是提供給您的入門邊界框範本。 ```

Use the bounding box tool to draw boxes around the requested target of interest:

Draw a rectangle using your mouse over each instance of the target.
Make sure the box does not cut into the target, leave a 2 - 3 pixel margin
When targets are overlapping, draw a box around each object, include all contiguous parts of the target in the box. Do not include parts that are completely overlapped by another object.
Do not include parts of the target that cannot be seen, even though you think you can interpolate the whole shape of the target.
Avoid shadows, they're not considered as a part of the target.
If the target goes off the screen, label up to the edge of the image.

Use the bounding box tool to draw boxes around the requested target of interest.

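```

The custom templates use the [Liquid template language](https://shopify.github.io/liquid/), and each of the items between double curly braces is a variable. The pre-annotation AWS Lambda function should provide an object named `taskInput`. The properties of that object can be accessed in your template through `{{ task.input. }}`, with the property name appended after `task.input.`.

## Your own Bounding Box custom template

As an example, assume you have a large collection of animal photos, and you know the kind of animal in each image from a prior image-classification job. Now you want to have a bounding box drawn around it.

In the starter sample, there are three variables: `taskObject`, `header`, and `labels`. Each of these is represented in a different part of the bounding box.
+ `taskObject` is an HTTP(S) URL or S3 URI for the photo to be annotated. The added `| grant_read_access` is a filter that converts an S3 URI to an HTTPS URL with short-lived access to that resource. If you are using an HTTP(S) URL, it is not needed.
+ `header` is the text above the photo to be labeled, something like "Draw a box around the bird in the photo."
+ `labels` is an array, represented as `['item1', 'item2', ...]`. These are labels that the worker can assign to the different boxes they draw. You can have one or many.

Each of the variable names comes from the JSON object in the response from your pre-annotation Lambda. The names above are merely suggestions. Use whatever variable names make sense to you and promote code readability among your team.

**Only use variables when necessary**
If a field will not change, you can remove that variable from the template and replace it with that text. Otherwise you have to repeat that text as a value in each object in your manifest or code it into your pre-annotation Lambda function.

**Example : Final customized bounding box template**
To keep things simple, this template has one variable, one label, and very basic instructions. Assuming your manifest has an "animal" property in each data object, that value can be reused in two parts of the template.

```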

Draw a bounding box around the {{ task.input.animal }} in the image. If there is more than one {{ task.input.animal }} per image, draw a bounding box around the largest one.

The box should be tight around the {{ task.input.animal }} with no more than a couple of pixels of buffer around the edges.

If the image does not contain a {{ task.input.animal }}, check the Nothing to label box.

Draw a bounding box around the {{ task.input.animal }} in each image. If there is more than one {{ task.input.animal }} per image, draw a bounding box around the largest one.

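```

Note the reuse of `{{ task.input.animal }}` throughout the template. If all of the animal names in your manifest begin with a capital letter, you can use `{{ task.input.animal | downcase }}`, which applies one of Liquid's built-in filters, in sentences where the name needs to be lowercase.

## Your manifest file

Your manifest file should provide the variable values that you use in your template. You can do some transformation of your manifest data in your pre-annotation Lambda, but if you don't need to, you lower the risk of errors and your Lambda runs faster. The following is a sample manifest file for the template.

```
{"source-ref": "", "animal": "horse"}
{"source-ref": "", "animal" : "bird"}
{"source-ref": "", "animal" : "dog"}
{"source-ref": "", "animal" : "cat"}
```

## Your pre-annotation Lambda function

As part of the job setup, provide the ARN of an AWS Lambda function that can be called to process your manifest entries and pass them to the template engine.

**Naming your Lambda function**
The best practice in naming your function is to use one of the following four strings as part of the function name: `SageMaker`, `Sagemaker`, `sagemaker`, or `LabelingFunction`. This applies to both your pre-annotation and post-annotation functions.

When you use the console, if you have AWS Lambda functions that are owned by your account, a drop-down list of functions that meet the naming requirements is provided for you to choose from.

In this very basic example, you just pass through the information from the manifest without doing any additional processing on it. This sample pre-annotation function is written for Python 3.7.

```
import json

def lambda_handler(event, context):
    # Pass every property of the manifest object to the template unchanged.
    return {
        "taskInput": event['dataObject']
    }
```

The JSON object from your manifest is provided as a child of the `event` object. The properties inside the `taskInput` object are available as variables to your template, so setting the value of `taskInput` to `event['dataObject']` passes all the values from your manifest object to your template without having to copy them individually. If you want to send more values to the template, you can add them to the `taskInput` object.

## Your post-annotation Lambda function

As part of the job setup, provide the ARN of an AWS Lambda function that can be called to process the form data when a worker completes a task. This can be as simple or as complex as you want. If you want to do answer consolidation and scoring, you can apply the scoring and/or consolidation algorithms of your choice. If you want to store the raw data for offline processing, that is an option.

**Provide permissions to your post-annotation Lambda**
The annotation data is in a file designated by the `s3Uri` string in the `payload` object. To process the annotations, even for a simple pass-through function, you need to assign `S3ReadOnly` access to your Lambda so that it can read the annotation files.
In the console page for creating your Lambda, scroll to the **Execution role** panel. Select **Create a new role from one or more templates**. Give the role a name. From the **Policy templates** drop-down list, choose **Amazon S3 object read-only permissions**. Save the Lambda, and the role is saved and selected.

The following sample is in Python 2.7.

```
import json
import boto3
from urlparse import urlparse

def lambda_handler(event, context):
    consolidated_labels = []

    # Locate and read the annotation file named in the event payload.
    parsed_url = urlparse(event['payload']['s3Uri'])
    s3 = boto3.client('s3')
    textFile = s3.get_object(Bucket=parsed_url.netloc, Key=parsed_url.path[1:])
    filecont = textFile['Body'].read()
    annotations = json.loads(filecont)

    for dataset in annotations:
        for annotation in dataset['annotations']:
            # Each worker's answer is stored as a JSON-encoded string.
            new_annotation = json.loads(annotation['annotationData']['content'])
            label = {
                'datasetObjectId': dataset['datasetObjectId'],
                'consolidatedAnnotation': {
                    'content': {
                        event['labelAttributeName']: {
                            'workerId': annotation['workerId'],
                            'boxesInfo': new_annotation,
                            'imageSource': dataset['dataObject']
                        }
                    }
                }
            }
            consolidated_labels.append(label)

    return consolidated_labels
```

The post-annotation Lambda often receives batches of task results in the event object. That batch is the `payload` object that the Lambda should iterate through. What you send back is an object that meets the [API contract](sms-custom-templates-step3.md).

## The output of your labeling job

You can find the output of the job in a folder named after your labeling job in the target S3 bucket that you specified. It is in a subfolder named `manifests`.

For a bounding box task, the output that you find in the output manifest looks similar to the following demo. The example has been cleaned up for printing. The actual output is a single line per record.

**Example : JSON in your output manifest**

```
{
  "source-ref":"",
  "":
  {
    "workerId":"",
    "imageSource":"",
    "boxesInfo":"{\"boundingBox\":{\"boundingBoxes\":[{\"height\":878, \"label\":\"bird\", \"left\":208, \"top\":6, \"width\":809}], \"inputImageProperties\":{\"height\":924, \"width\":1280}}}"},
  "-metadata":
  {
    "type":"groundTruth/custom",
    "job_name":"",
    "human-annotated":"yes"
  },
  "animal" : "bird"
}
```

Note how the additional `animal` attribute from your original manifest is passed to the output manifest on the same level as the `source-ref` and labeling data. Any properties from your input manifest, whether or not they were used in your template, are passed to the output manifest.

# Demo template: Labeling intents with `crowd-classifier`

If you choose a custom template, you reach the **Custom labeling task panel**. There you can select from multiple starter templates that represent some of the more common tasks. The templates provide a starting point to work from in building the template for your customized labeling task.

In this demonstration, you work with the **Intent Detection** template, which uses the `crowd-classifier` element, and the AWS Lambda functions needed for processing your data before and after the task.

**Topics**
+ [Starter Intent Detection custom template](#sms-custom-templates-step2-demo2-base-template)
+ [Your Intent Detection custom template](#sms-custom-templates-step2-demo2-your-template)
+ [Your pre-annotation Lambda function](#sms-custom-templates-step2-demo2-pre-lambda)
+ [Your post-annotation Lambda function](#sms-custom-templates-step2-demo2-post-lambda)
+ [Your labeling job output](#sms-custom-templates-step2-demo2-job-output)

## Starter Intent Detection custom template

This is the intent detection template that is provided as a starting point.

```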

Select the most relevant intention expressed by the text.

Example: I would like to return a pair of shoes

Intent: Return

Pick the most relevant intention expressed by the text

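```

The custom templates use the [Liquid template language](https://shopify.github.io/liquid/), and each of the items between double curly braces is a variable. The pre-annotation AWS Lambda function should provide an object named `taskInput`. The properties of that object can be accessed in your template through `{{ task.input. }}`, with the property name appended after `task.input.`.

## Your Intent Detection custom template

In the starter template, there are two variables: the `task.input.labels` property in the opening tag of the `crowd-classifier` element, and `task.input.utterance` in the content of the `classification-target` region.

Unless you need to offer different sets of labels with different utterances, avoiding a variable and just using text saves processing time and reduces the possibility of error. The template used in this demonstration removes that variable, but variables and filters like `to_json` are explained in more detail in the [`crowd-bounding-box` demonstration](sms-custom-templates-step2-demo1.md) article.

### Styling your elements

Two parts of these custom elements sometimes get overlooked: the short instructions and full instructions regions. Good instructions generate good results.

In the elements that include these regions, the short instructions appear automatically in the "Instructions" pane on the left of the worker's screen. The full instructions are linked from the "View full instructions" link near the top of that pane. Clicking the link opens a modal pane with more detailed instructions.

You can use HTML, CSS, and JavaScript in these sections, and you are encouraged to do so if you believe you can provide a strong set of instructions and examples that help workers complete your tasks with better speed and accuracy.

**Example Try out a sample with JSFiddle**
[https://jsfiddle.net/MTGT_Fiddle_Manager/bjc0y1vd/35/](https://jsfiddle.net/MTGT_Fiddle_Manager/bjc0y1vd/35/)

Try out an [example `crowd-classifier` task](https://jsfiddle.net/MTGT_Fiddle_Manager/bjc0y1vd/35/). The example is rendered by JSFiddle, so all the template variables are replaced with hard-coded values. Click the "View full instructions" link to see a set of examples with extended CSS styling. You can fork the project to experiment with your own changes to the CSS, add sample images, or add extended JavaScript functionality.

**Example : Final customized intent detection template**
This example uses the [example `crowd-classifier` task](https://jsfiddle.net/MTGT_Fiddle_Manager/bjc0y1vd/35/), but with a variable for the classification target. If you are trying to keep a consistent CSS design across a series of different labeling jobs, you can include an external stylesheet by using a `link` element, the same way you would in any other HTML document.

```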

In the statements and questions provided in this exercise, what category of action is the speaker interested in doing?

Example Utterance	Good Choice
When is the Seahawks game on?	eat watch browse
Example Utterance	Bad Choice
When is the Seahawks game on?	buy eat watch

What is the speaker expressing they would like to do next?

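```

**Example : Your manifest file**
If you are preparing your manifest file manually for a text-classification task like this, format your data in the following manner.

```
{"source": "Roses are red"}
{"source": "Violets are Blue"}
{"source": "Ground Truth is the best"}
{"source": "And so are you"}
```

This differs from the manifest file used for the "[Demo template: Annotation of images with `crowd-bounding-box`](sms-custom-templates-step2-demo1.md)" demonstration, which used `source-ref` as the property name instead of `source`. Use `source-ref` to designate S3 URIs for images or other files that must be converted to HTTP. Otherwise, use `source`, as with the text strings above.

## Your pre-annotation Lambda function

As part of the job setup, provide the ARN of an AWS Lambda function that can be called to process your manifest entries and pass them to the template engine.

This Lambda function is required to have one of the following four strings as part of the function name: `SageMaker`, `Sagemaker`, `sagemaker`, or `LabelingFunction`. This applies to both your pre-annotation and post-annotation Lambdas.

When you use the console, if you have Lambdas that are owned by your account, a drop-down list of functions that meet the naming requirements is provided for you to choose from.

In this very basic sample, where you have only one variable, the function is primarily a pass-through. The following is a sample pre-labeling Lambda that uses Python 3.7.

```
import json

def lambda_handler(event, context):
    # Pass every property of the manifest object to the template unchanged.
    return {
        "taskInput": event['dataObject']
    }
```

The `dataObject` property of the `event` contains the properties from a data object in your manifest.

In this demonstration, which is a simple pass-through, you pass that object straight through as the `taskInput` value. If you add properties with values to the `event['dataObject']` object, they become available to your HTML template as Liquid variables in the format `{{ task.input. }}`, with the property name appended after `task.input.`.

## Your post-annotation Lambda function

As part of the job setup, provide the ARN of a Lambda function that can be called to process the form data when a worker completes a task. This can be as simple or as complex as you want. If you want to do answer consolidation and scoring as the data comes in, you can apply the scoring or consolidation algorithms of your choice. If you want to store the raw data for offline processing, that is an option.

**Set permissions for your post-annotation Lambda function**
The annotation data is in a file designated by the `s3Uri` string in the `payload` object. To process the annotations, even for a simple pass-through function, you need to assign `S3ReadOnly` access to your Lambda so that it can read the annotation files.
In the console page for creating your Lambda, scroll to the **Execution role** panel. Select **Create a new role from one or more templates**. Give the role a name. From the **Policy templates** drop-down list, choose **Amazon S3 object read-only permissions**. Save the Lambda, and the role is saved and selected.

The following sample is for Python 3.7.

```
import json
import boto3
from urllib.parse import urlparse

def lambda_handler(event, context):
    consolidated_labels = []

    # Locate and read the annotation file named in the event payload.
    parsed_url = urlparse(event['payload']['s3Uri'])
    s3 = boto3.client('s3')
    textFile = s3.get_object(Bucket=parsed_url.netloc, Key=parsed_url.path[1:])
    filecont = textFile['Body'].read()
    annotations = json.loads(filecont)

    for dataset in annotations:
        for annotation in dataset['annotations']:
            # Each worker's answer is stored as a JSON-encoded string.
            new_annotation = json.loads(annotation['annotationData']['content'])
            label = {
                'datasetObjectId': dataset['datasetObjectId'],
                'consolidatedAnnotation': {
                    'content': {
                        event['labelAttributeName']: {
                            'workerId': annotation['workerId'],
                            'result': new_annotation,
                            'labeledContent': dataset['dataObject']
                        }
                    }
                }
            }
            consolidated_labels.append(label)

    return consolidated_labels
```

## Your labeling job output

The post-annotation Lambda often receives batches of task results in the event object. That batch is the `payload` object that the Lambda should iterate through.

You can find the output of the job in a folder named after your labeling job in the target S3 bucket that you specified. It is in a subfolder named `manifests`.

For an intent detection task, the output in the output manifest looks similar to the following demo. The example has been cleaned up and spaced out to be easier for humans to read. The actual output is more compressed for machine reading.

**Example : JSON in your output manifest**

```
[
  {
    "datasetObjectId":"",
    "consolidatedAnnotation":
    {
      "content":
      {
        "":
        {
          "workerId":"private.us-east-1.XXXXXXXXXXXXXXXXXXXXXX",
          "result": { "intent": { "label":"" } },
          "labeledContent": { "content":"" }
        }
      }
    }
  },
  {
    "datasetObjectId":"",
    "consolidatedAnnotation":
    {
      "content":
      {
        "":
        {
          "workerId":"private.us-east-1.6UDLPKQZHYWJQSCA4MBJBB7FWE",
          "result": { "intent": { "label": "" } },
          "labeledContent": { "content": "" }
        }
      }
    }
  },
  ...
  ...
  ...
]
```

This should help you create and use your own custom template.

# Creating a custom workflow using the API

When you have created your custom UI template (Step 2) and processing Lambda functions (Step 3), place the template in an Amazon S3 bucket with a file name that ends in `.liquid.html`.

Use the [`CreateLabelingJob`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateLabelingJob.html) action to configure your task. Use the location of the custom template ([Create a custom worker task template](sms-custom-templates-step2.md)) stored in a `.liquid.html` file on S3 as the value for the `UiTemplateS3Uri` field in the [`UiConfig`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_UiConfig.html) object within the [`HumanTaskConfig`](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_HumanTaskConfig.html) object.

For the AWS Lambda tasks described in [Processing data in a custom labeling workflow with AWS Lambda](sms-custom-templates-step3.md), the ARN of the post-annotation task is used as the value for the `AnnotationConsolidationLambdaArn` field, and the ARN of the pre-annotation task is used as the value for `PreHumanTaskLambdaArn`.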
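
The field mapping described above can be sketched with the AWS SDK for Python (boto3). This is a minimal illustration, not a complete job configuration: the helper names `build_labeling_job_request` and `create_labeling_job` are hypothetical, and the task title, description, worker count, and time limit are placeholder assumptions that you would replace with values suited to your own job.

```python
def build_labeling_job_request(
    job_name,
    label_attribute_name,
    manifest_s3_uri,
    output_s3_uri,
    role_arn,
    workteam_arn,
    template_s3_uri,
    pre_lambda_arn,
    post_lambda_arn,
):
    """Assemble keyword arguments for the CreateLabelingJob action.

    Hypothetical helper: it only builds a dictionary and does not call AWS.
    """
    return {
        "LabelingJobName": job_name,
        "LabelAttributeName": label_attribute_name,
        "InputConfig": {
            "DataSource": {"S3DataSource": {"ManifestS3Uri": manifest_s3_uri}}
        },
        "OutputConfig": {"S3OutputPath": output_s3_uri},
        "RoleArn": role_arn,
        "HumanTaskConfig": {
            "WorkteamArn": workteam_arn,
            # Custom template stored in S3 as a .liquid.html file.
            "UiConfig": {"UiTemplateS3Uri": template_s3_uri},
            # ARN of the pre-annotation Lambda function.
            "PreHumanTaskLambdaArn": pre_lambda_arn,
            # ARN of the post-annotation Lambda function.
            "AnnotationConsolidationConfig": {
                "AnnotationConsolidationLambdaArn": post_lambda_arn
            },
            # Placeholder values, assumed for illustration only.
            "TaskTitle": "Draw a box around the animal",
            "TaskDescription": "Draw a bounding box around the animal in each image.",
            "NumberOfHumanWorkersPerDataObject": 1,
            "TaskTimeLimitInSeconds": 300,
        },
    }


def create_labeling_job(request):
    """Submit the request. Requires boto3 and valid AWS credentials."""
    import boto3  # imported here so the request can be built without boto3

    return boto3.client("sagemaker").create_labeling_job(**request)
```

Keeping the request construction separate from the API call lets you inspect or log the dictionary before anything is submitted. Passing the result to `create_labeling_job` starts the job, so only do that with real ARNs and S3 URIs from your own account.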