# AWS CloudFormation for AWS Glue
<a name="populate-with-cloudformation-templates"></a>

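CloudFormation is a service that can create many AWS resources. AWS Glue provides API operations to create objects in the AWS Glue Data Catalog. However, it might be more convenient to define and create AWS Glue objects and other related AWS resource objects in a CloudFormation template file. You can then automate the process of creating the objects.

CloudFormation provides a simplified syntax, either JSON (JavaScript Object Notation) or YAML (YAML Ain't Markup Language), to express the creation of AWS resources. You can use CloudFormation templates to define Data Catalog objects such as databases, tables, partitions, crawlers, classifiers, and connections. You can also define ETL objects such as jobs, triggers, and development endpoints. You create a template that describes all the AWS resources you want, and CloudFormation takes care of provisioning and configuring those resources for you.

For more information, see [What is AWS CloudFormation?](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/Welcome.html) and [Working with AWS CloudFormation templates](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/template-guide.html) in the *AWS CloudFormation User Guide*.

If you plan to use CloudFormation templates that are compatible with AWS Glue, then as an administrator you must grant access to CloudFormation and to the AWS services and actions on which it depends. To grant permissions to create CloudFormation resources, attach the following policy to the users who work with CloudFormation: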

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cloudformation:*"
      ],
      "Resource": "*"
    }
  ]
}
```

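The following table lists the actions that a CloudFormation template can perform on your behalf. It includes links to information about the AWS resource types and their property types that you can add to a CloudFormation template.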


| AWS Glue resource | CloudFormation template | AWS Glue samples | 
| --- | --- | --- | 
| Classifier | [AWS::Glue::Classifier](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-classifier.html) | [Grok classifier](#sample-cfn-template-classifier), [JSON classifier](#sample-cfn-template-classifier-json), [XML classifier](#sample-cfn-template-classifier-xml) | 
| Connection | [AWS::Glue::Connection](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-connection.html) | [MySQL connection](#sample-cfn-template-connection) | 
| Crawler | [AWS::Glue::Crawler](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html) | [Amazon S3 crawler](#sample-cfn-template-crawler-s3), [MySQL crawler](#sample-cfn-template-crawler-jdbc) | 
| Database | [AWS::Glue::Database](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-database.html) | [Empty database](#sample-cfn-template-database), [Database with tables](#sample-cfn-template-db-table-partition) | 
| Development endpoint | [AWS::Glue::DevEndpoint](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-devendpoint.html) | [Development endpoint](#sample-cfn-template-devendpoint) | 
| Integration | [AWS::Glue::Integration](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-integration.html) | [Zero-ETL integration](#sample-cfn-template-integration) | 
| Integration resource property | [AWS::Glue::IntegrationResourceProperty](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-integrationresourceproperty.html) | [Zero-ETL integration with integration resource properties](#sample-cfn-template-integration-resource-property) | 
| Job | [AWS::Glue::Job](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-job.html) | [Amazon S3 job](#sample-cfn-template-job-s3), [JDBC job](#sample-cfn-template-job-jdbc) | 
| Machine learning transform | [AWS::Glue::MLTransform](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-mltransform.html) | [Machine learning transform](#sample-cfn-template-machine-learning-transform) | 
| Data quality ruleset | [AWS::Glue::DataQualityRuleset](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-dataqualityruleset.html) | [Data quality ruleset](#sample-cfn-template-data-quality-ruleset), [Data quality ruleset with EventBridge scheduler](#sample-cfn-template-data-quality-ruleset-eventbridge) | 
| Partition | [AWS::Glue::Partition](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-partition.html) | [Partitions of a table](#sample-cfn-template-db-table-partition) | 
| Table | [AWS::Glue::Table](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-table.html) | [Table in a database](#sample-cfn-template-db-table-partition) | 
| Trigger | [AWS::Glue::Trigger](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-trigger.html) | [On-demand trigger](#sample-cfn-template-trigger-ondemand), [Scheduled trigger](#sample-cfn-template-trigger-scheduled), [Conditional trigger](#sample-cfn-template-trigger-conditional) | 

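To get started, use the following sample templates and customize them with your own metadata. Then use the CloudFormation console to create a CloudFormation stack that adds objects to AWS Glue and any associated services. Many fields in an AWS Glue object are optional. These templates illustrate the fields that are required or that are necessary for a working and functional AWS Glue object.

A CloudFormation template can be in either JSON or YAML format. These samples use YAML for easier readability. The samples contain comments (`#`) that describe the values defined in the templates.

CloudFormation templates can include a `Parameters` section. You can change this section in the sample text or when you submit the YAML file to the CloudFormation console to create a stack. The `Resources` section of the template contains the definitions of AWS Glue and related objects. CloudFormation template syntax definitions might contain properties that include more detailed property syntax. Not all properties are required to create an AWS Glue object. The following samples show example values for common properties used to create AWS Glue objects.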

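## Sample CloudFormation template for an AWS Glue database
<a name="sample-cfn-template-database"></a>

An AWS Glue database in the Data Catalog contains metadata tables. The database consists of very few properties and can be created in the Data Catalog with a CloudFormation template. The following sample template is provided to get you started and to illustrate the use of CloudFormation stacks with AWS Glue. The only resource created by the sample template is a database named `cfn-mysampledatabase`. You can change the name by editing the text of the sample or by changing the value on the CloudFormation console when you submit the YAML.

The following shows example values for common properties used to create an AWS Glue database. For more information about the CloudFormation database template for AWS Glue, see [AWS::Glue::Database](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-database.html).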

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CloudFormation template in YAML to demonstrate creating a database named cfn-mysampledatabase
# The metadata created in the Data Catalog points to the flights public S3 bucket
#
# Parameters section contains names that are substituted in the Resources section
# These parameters are the names of the resources created in the Data Catalog
Parameters:
  CFNDatabaseName:
    Type: String
    Default: cfn-mysampledatabase

# Resources section defines metadata for the Data Catalog
Resources:
# Create an AWS Glue database
  CFNDatabaseFlights:
    Type: AWS::Glue::Database
    Properties:
      # The database is created in the Data Catalog for your account
      CatalogId: !Ref AWS::AccountId   
      DatabaseInput:
        # The name of the database is defined in the Parameters section above
        Name: !Ref CFNDatabaseName	
        Description: Database to hold tables for flights data
        LocationUri: s3://crawler-public-us-east-1/flight/2016/csv/
        #Parameters: Leave AWS database parameters blank
```

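## Sample CloudFormation template for an AWS Glue database, table, and partition
<a name="sample-cfn-template-db-table-partition"></a>

An AWS Glue table contains the metadata that defines the structure and location of data that you want to process with your ETL scripts. Within a table, you can define partitions to parallelize the processing of your data. A partition is a chunk of data that you define with a key. For example, using month as a key, all the data for January is contained in the same partition. In AWS Glue, databases can contain tables, and tables can contain partitions.

The following sample shows how to populate a database, a table, and partitions using a CloudFormation template. The base data format is `csv`, delimited by a comma (,). Because a database must exist before it can contain a table, and a table must exist before partitions can be created, the template uses the `DependsOn` statement to define the dependencies among these objects when they are created.

The values in this sample define a table that contains flight data from a publicly available Amazon S3 bucket. For illustration, only a few columns of the data and one partitioning key are defined. Four partitions are also defined in the Data Catalog. Some fields that describe the storage of the base data are also shown in the `StorageDescriptor` fields.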

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CloudFormation template in YAML to demonstrate creating a database, a table, and partitions
# The metadata created in the Data Catalog points to the flights public S3 bucket
#
# Parameters substituted in the Resources section
# These parameters are names of the resources created in the Data Catalog
Parameters:
  CFNDatabaseName:
    Type: String
    Default: cfn-database-flights-1
  CFNTableName1:
    Type: String
    Default: cfn-manual-table-flights-1
# Resources to create metadata in the Data Catalog
Resources:
###
# Create an AWS Glue database
  CFNDatabaseFlights:
    Type: AWS::Glue::Database
    Properties:
      CatalogId: !Ref AWS::AccountId
      DatabaseInput:
        Name: !Ref CFNDatabaseName	
        Description: Database to hold tables for flights data
###
# Create an AWS Glue table
  CFNTableFlights:
    # Creating the table waits for the database to be created
    DependsOn: CFNDatabaseFlights
    Type: AWS::Glue::Table
    Properties:
      CatalogId: !Ref AWS::AccountId
      DatabaseName: !Ref CFNDatabaseName
      TableInput:
        Name: !Ref CFNTableName1
        Description: Define the first few columns of the flights table
        TableType: EXTERNAL_TABLE
        Parameters:
          classification: csv
#       ViewExpandedText: String
        PartitionKeys:
        # Data is partitioned by month
        - Name: mon
          Type: bigint
        StorageDescriptor:
          OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
          Columns:
          - Name: year
            Type: bigint
          - Name: quarter
            Type: bigint
          - Name: month
            Type: bigint
          - Name: day_of_month
            Type: bigint			
          InputFormat: org.apache.hadoop.mapred.TextInputFormat
          Location: s3://crawler-public-us-east-1/flight/2016/csv/
          SerdeInfo:
            Parameters:
              field.delim: ","
            SerializationLibrary: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
# Partition 1
# Create an AWS Glue partition  
  CFNPartitionMon1:
    DependsOn: CFNTableFlights
    Type: AWS::Glue::Partition
    Properties:
      CatalogId: !Ref AWS::AccountId
      DatabaseName: !Ref CFNDatabaseName
      TableName: !Ref CFNTableName1
      PartitionInput:
        Values:
        - 1
        StorageDescriptor:
          OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
          Columns:
          - Name: mon
            Type: bigint
          InputFormat: org.apache.hadoop.mapred.TextInputFormat
          Location: s3://crawler-public-us-east-1/flight/2016/csv/mon=1/
          SerdeInfo:
            Parameters:
              field.delim: ","
            SerializationLibrary: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
# Partition 2
# Create an AWS Glue partition 
  CFNPartitionMon2:
    DependsOn: CFNTableFlights
    Type: AWS::Glue::Partition
    Properties:
      CatalogId: !Ref AWS::AccountId
      DatabaseName: !Ref CFNDatabaseName
      TableName: !Ref CFNTableName1
      PartitionInput:
        Values:
        - 2
        StorageDescriptor:
          OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
          Columns:
          - Name: mon
            Type: bigint
          InputFormat: org.apache.hadoop.mapred.TextInputFormat
          Location: s3://crawler-public-us-east-1/flight/2016/csv/mon=2/
          SerdeInfo:
            Parameters:
              field.delim: ","
            SerializationLibrary: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
# Partition 3
# Create an AWS Glue partition 
  CFNPartitionMon3:
    DependsOn: CFNTableFlights
    Type: AWS::Glue::Partition
    Properties:
      CatalogId: !Ref AWS::AccountId
      DatabaseName: !Ref CFNDatabaseName
      TableName: !Ref CFNTableName1
      PartitionInput:
        Values:
        - 3
        StorageDescriptor:
          OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
          Columns:
          - Name: mon
            Type: bigint
          InputFormat: org.apache.hadoop.mapred.TextInputFormat
          Location: s3://crawler-public-us-east-1/flight/2016/csv/mon=3/
          SerdeInfo:
            Parameters:
              field.delim: ","
            SerializationLibrary: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
# Partition 4
# Create an AWS Glue partition 
  CFNPartitionMon4:
    DependsOn: CFNTableFlights
    Type: AWS::Glue::Partition
    Properties:
      CatalogId: !Ref AWS::AccountId
      DatabaseName: !Ref CFNDatabaseName
      TableName: !Ref CFNTableName1
      PartitionInput:
        Values:
        - 4
        StorageDescriptor:
          OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
          Columns:
          - Name: mon
            Type: bigint
          InputFormat: org.apache.hadoop.mapred.TextInputFormat
          Location: s3://crawler-public-us-east-1/flight/2016/csv/mon=4/
          SerdeInfo:
            Parameters:
              field.delim: ","
            SerializationLibrary: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
```
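
As an alternative to explicit `DependsOn`, CloudFormation also infers creation order when one resource refers to another with `!Ref`. The following is a minimal sketch of that approach, not a replacement for the template above. It assumes that `!Ref` on an `AWS::Glue::Database` returns the database name and that `!Ref` on an `AWS::Glue::Table` returns the table name; confirm this on the resource reference pages before relying on it. The column list and partition count are shortened for illustration.

```yaml
---
AWSTemplateFormatVersion: '2010-09-09'
# Sketch: implicit dependencies through !Ref instead of DependsOn
Resources:
  CFNDatabaseFlights:
    Type: AWS::Glue::Database
    Properties:
      CatalogId: !Ref AWS::AccountId
      DatabaseInput:
        Name: cfn-database-flights-1
  CFNTableFlights:
    Type: AWS::Glue::Table
    Properties:
      CatalogId: !Ref AWS::AccountId
      # Referencing the database resource (assumed to return its name)
      # makes CloudFormation create the database first
      DatabaseName: !Ref CFNDatabaseFlights
      TableInput:
        Name: cfn-manual-table-flights-1
        TableType: EXTERNAL_TABLE
        Parameters:
          classification: csv
        PartitionKeys:
        - Name: mon
          Type: bigint
        StorageDescriptor:
          Columns:
          - Name: year
            Type: bigint
          InputFormat: org.apache.hadoop.mapred.TextInputFormat
          OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
          Location: s3://crawler-public-us-east-1/flight/2016/csv/
          SerdeInfo:
            Parameters:
              field.delim: ","
            SerializationLibrary: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
  CFNPartitionMon1:
    Type: AWS::Glue::Partition
    Properties:
      CatalogId: !Ref AWS::AccountId
      DatabaseName: !Ref CFNDatabaseFlights
      # Referencing the table resource (assumed to return its name)
      # makes CloudFormation create the table before the partition
      TableName: !Ref CFNTableFlights
      PartitionInput:
        Values:
        - "1"
        StorageDescriptor:
          Columns:
          - Name: mon
            Type: bigint
          InputFormat: org.apache.hadoop.mapred.TextInputFormat
          OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
          Location: s3://crawler-public-us-east-1/flight/2016/csv/mon=1/
          SerdeInfo:
            Parameters:
              field.delim: ","
            SerializationLibrary: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
```

The trade-off is that the names are no longer supplied through shared parameters, so explicit `DependsOn` remains the simpler choice when several resources must use a parameterized name.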

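## Sample CloudFormation template for an AWS Glue grok classifier
<a name="sample-cfn-template-classifier"></a>

An AWS Glue classifier determines the schema of your data. One type of custom classifier uses a grok pattern to match your data. If the pattern matches, the custom classifier is used to create your table's schema and to set the `classification` to the value set in the classifier definition.

This sample creates a classifier that creates a schema with one column named `message` and sets the classification to `greedy`.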

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate creating a classifier
#
# Parameters section contains names that are substituted in the Resources section
# These parameters are the names of the resources created in the Data Catalog
Parameters:                                                                                                       
# The name of the classifier to be created
  CFNClassifierName:  
    Type: String
    Default: cfn-classifier-grok-one-column-1                                                               	
#
#
# Resources section defines metadata for the Data Catalog
Resources:
# Create classifier that uses grok pattern to put all data in one column and classifies it as "greedy".	
  CFNClassifierFlights:
    Type: AWS::Glue::Classifier   
    Properties:
      GrokClassifier:
        #Grok classifier that puts all data in one column		
        Name: !Ref CFNClassifierName
        Classification: greedy                                                        	   
        GrokPattern: "%{GREEDYDATA:message}"
        #CustomPatterns: none
```

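## Sample CloudFormation template for an AWS Glue JSON classifier
<a name="sample-cfn-template-classifier-json"></a>

An AWS Glue classifier determines the schema of your data. One type of custom classifier uses a `JsonPath` string that defines the JSON data for the classifier to classify. AWS Glue supports a subset of the `JsonPath` operators, as described in [Writing JsonPath Custom Classifiers](https://docs.aws.amazon.com/glue/latest/dg/custom-classifier.html#custom-classifier-json).

If the pattern matches, the custom classifier is used to create your table's schema.

This sample creates a classifier that creates a schema in which each record comes from the `Records3` array in an object.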

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate creating a JSON classifier
#
# Parameters section contains names that are substituted in the Resources section
# These parameters are the names of the resources created in the Data Catalog
Parameters:                                                                                                       
# The name of the classifier to be created
  CFNClassifierName:  
    Type: String
    Default: cfn-classifier-json-one-column-1                                                               	
#
#
# Resources section defines metadata for the Data Catalog
Resources:
# Create classifier that uses a JSON pattern.	
  CFNClassifierFlights:
    Type: AWS::Glue::Classifier   
    Properties:
      JSONClassifier:
        #JSON classifier		
        Name: !Ref CFNClassifierName
        JsonPath: $.Records3[*]
```

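## Sample CloudFormation template for an AWS Glue XML classifier
<a name="sample-cfn-template-classifier-xml"></a>

An AWS Glue classifier determines the schema of your data. One type of custom classifier specifies an XML tag to designate the element that contains each record in an XML document that is being parsed. If the pattern matches, the custom classifier is used to create your table's schema and to set the `classification` to the value set in the classifier definition.

This sample creates a classifier that creates a schema in which each record is in the `Records` tag and sets the classification to `XML`.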

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate creating an XML classifier
#
# Parameters section contains names that are substituted in the Resources section
# These parameters are the names of the resources created in the Data Catalog
Parameters:                                                                                                       
# The name of the classifier to be created
  CFNClassifierName:  
    Type: String
    Default: cfn-classifier-xml-one-column-1                                                               	
#
#
# Resources section defines metadata for the Data Catalog
Resources:
# Create classifier that uses the XML pattern and classifies it as "XML".	
  CFNClassifierFlights:
    Type: AWS::Glue::Classifier   
    Properties:
      XMLClassifier:
        #XML classifier		
        Name: !Ref CFNClassifierName
        Classification: XML   
        RowTag: <Records>
```

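## Sample CloudFormation template for an AWS Glue crawler for Amazon S3
<a name="sample-cfn-template-crawler-s3"></a>

An AWS Glue crawler creates metadata tables in your Data Catalog that correspond to your data. You can then use these table definitions as sources and targets in your ETL jobs.

This sample creates a crawler, the required IAM role, and an AWS Glue database in the Data Catalog. When this crawler is run, it assumes the IAM role and creates a table in the database for the public flights data. The table is created with the prefix "`cfn_sample_1_`". The IAM role created by this template allows global permissions; you might want to create a custom role. This crawler does not define any custom classifiers. AWS Glue built-in classifiers are used by default.

When you submit this sample to the CloudFormation console, you must confirm that you want to create the IAM role.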

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate creating a crawler
#
# Parameters section contains names that are substituted in the Resources section
# These parameters are the names of the resources created in the Data Catalog
Parameters:                                                                                                       
# The name of the crawler to be created
  CFNCrawlerName:  
    Type: String
    Default: cfn-crawler-flights-1
  CFNDatabaseName:
    Type: String
    Default: cfn-database-flights-1
  CFNTablePrefixName:
    Type: String
    Default: cfn_sample_1_	
#
#
# Resources section defines metadata for the Data Catalog
Resources:
#Create IAM Role assumed by the crawler. For demonstration, this role is given all permissions.
  CFNRoleFlights:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          -
            Effect: "Allow"
            Principal:
              Service:
                - "glue.amazonaws.com"
            Action:
              - "sts:AssumeRole"
      Path: "/"
      Policies:
        -
          PolicyName: "root"
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              -
                Effect: "Allow"
                Action: "*"
                Resource: "*"
 # Create a database to contain tables created by the crawler
  CFNDatabaseFlights:
    Type: AWS::Glue::Database
    Properties:
      CatalogId: !Ref AWS::AccountId
      DatabaseInput:
        Name: !Ref CFNDatabaseName
        Description: "AWS Glue container to hold metadata tables for the flights crawler"
 #Create a crawler to crawl the flights data on a public S3 bucket
  CFNCrawlerFlights:
    Type: AWS::Glue::Crawler
    Properties:
      Name: !Ref CFNCrawlerName
      Role: !GetAtt CFNRoleFlights.Arn
      #Classifiers: none, use the default classifier
      Description: AWS Glue crawler to crawl flights data
      #Schedule: none, use default run-on-demand
      DatabaseName: !Ref CFNDatabaseName
      Targets:
        S3Targets:
          # Public S3 bucket with the flights data
          - Path: "s3://crawler-public-us-east-1/flight/2016/csv"
      TablePrefix: !Ref CFNTablePrefixName
      SchemaChangePolicy:
        UpdateBehavior: "UPDATE_IN_DATABASE"
        DeleteBehavior: "LOG"
      Configuration: "{\"Version\":1.0,\"CrawlerOutput\":{\"Partitions\":{\"AddOrUpdateBehavior\":\"InheritFromTable\"},\"Tables\":{\"AddOrUpdateBehavior\":\"MergeNewColumns\"}}}"
```
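
Because the role in the sample grants every action on every resource, a custom role is usually preferable outside of a demonstration. The following is a minimal sketch of a narrower role, assuming that the AWS managed policy `AWSGlueServiceRole` plus read access to the public flights bucket is sufficient for this crawler. The inline policy name `read-flights-data` and the exact list of S3 actions are assumptions for illustration; adjust them to your own data location.

```yaml
---
AWSTemplateFormatVersion: '2010-09-09'
# Sketch: a crawler role with narrower permissions than Action "*"
Resources:
  CFNRoleFlights:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: "Allow"
            Principal:
              Service:
                - "glue.amazonaws.com"
            Action:
              - "sts:AssumeRole"
      Path: "/"
      ManagedPolicyArns:
        # AWS managed policy with baseline permissions for AWS Glue
        - "arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole"
      Policies:
        # Hypothetical inline policy: read-only access to the crawled data
        - PolicyName: "read-flights-data"
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: "Allow"
                Action:
                  - "s3:GetObject"
                  - "s3:ListBucket"
                Resource:
                  - "arn:aws:s3:::crawler-public-us-east-1"
                  - "arn:aws:s3:::crawler-public-us-east-1/flight/2016/csv/*"
```

The crawler resource can keep `Role: !GetAtt CFNRoleFlights.Arn` unchanged, because the logical ID of the role is the same as in the sample.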

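## Sample CloudFormation template for an AWS Glue connection
<a name="sample-cfn-template-connection"></a>

An AWS Glue connection in the Data Catalog contains the JDBC and network information that is required to connect to a JDBC database. This information is used when you connect to a JDBC database to crawl or to run ETL jobs.

This sample creates a connection to an Amazon RDS MySQL database named `devdb`. When this connection is used, an IAM role, database credentials, and network connection values must also be supplied. See the details of the necessary fields in the template.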

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate creating a connection
#
# Parameters section contains names that are substituted in the Resources section
# These parameters are the names of the resources created in the Data Catalog
Parameters:                                                                                                       
# The name of the connection to be created
  CFNConnectionName:  
    Type: String
    Default: cfn-connection-mysql-flights-1
  CFNJDBCString:  
    Type: String
    Default: "jdbc:mysql://xxx-mysql.yyyyyyyyyyyyyy.us-east-1.rds.amazonaws.com:3306/devdb"
  CFNJDBCUser:  
    Type: String
    Default: "master"
  CFNJDBCPassword:  
    Type: String
    Default: "12345678"
    NoEcho: true
#
#
# Resources section defines metadata for the Data Catalog
Resources:
  CFNConnectionMySQL:
    Type: AWS::Glue::Connection
    Properties:
      CatalogId: !Ref AWS::AccountId
      ConnectionInput: 
        Description: "Connect to MySQL database."
        ConnectionType: "JDBC"
        #MatchCriteria: none		
        PhysicalConnectionRequirements:
          AvailabilityZone: "us-east-1d"
          SecurityGroupIdList: 
           - "sg-7d52b812"
          SubnetId: "subnet-84f326ee" 
        ConnectionProperties: {
          "JDBC_CONNECTION_URL": !Ref CFNJDBCString,
          "USERNAME": !Ref CFNJDBCUser,
          "PASSWORD": !Ref CFNJDBCPassword
        }
        Name: !Ref CFNConnectionName
```
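
The sample passes the database password as a template parameter with a default value. One possible alternative is a CloudFormation dynamic reference to AWS Secrets Manager, so that the credentials do not appear in the template or in parameter values. The following is a sketch under stated assumptions: the secret name `my-glue-mysql-secret` is hypothetical, the secret is assumed to already exist and to store JSON keys named `username` and `password`, and the network settings (`PhysicalConnectionRequirements`) are omitted for brevity.

```yaml
---
AWSTemplateFormatVersion: '2010-09-09'
# Sketch: resolve database credentials from Secrets Manager at deploy time
Parameters:
  CFNConnectionName:
    Type: String
    Default: cfn-connection-mysql-flights-1
  CFNJDBCString:
    Type: String
    Default: "jdbc:mysql://xxx-mysql.yyyyyyyyyyyyyy.us-east-1.rds.amazonaws.com:3306/devdb"
  # Hypothetical name of an existing Secrets Manager secret
  CFNSecretName:
    Type: String
    Default: my-glue-mysql-secret
Resources:
  CFNConnectionMySQL:
    Type: AWS::Glue::Connection
    Properties:
      CatalogId: !Ref AWS::AccountId
      ConnectionInput:
        Name: !Ref CFNConnectionName
        Description: "Connect to MySQL database."
        ConnectionType: "JDBC"
        ConnectionProperties:
          JDBC_CONNECTION_URL: !Ref CFNJDBCString
          # Dynamic references are resolved by CloudFormation during stack operations
          USERNAME: !Sub "{{resolve:secretsmanager:${CFNSecretName}:SecretString:username}}"
          PASSWORD: !Sub "{{resolve:secretsmanager:${CFNSecretName}:SecretString:password}}"
```

Note that the resolved values are still passed to AWS Glue as connection properties; this sketch only keeps them out of the template text and the stack parameters.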

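## Sample CloudFormation template for an AWS Glue zero-ETL integration
<a name="sample-cfn-template-integration"></a>

AWS zero-ETL is a set of fully managed integrations that minimizes the need to build ETL data pipelines for common ingestion and replication use cases.

This sample creates a zero-ETL integration from the specified source to the specified target.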

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate creating a zero-ETL integration in AWS Glue
#
# Parameters section contains names that are substituted in the Resources section
# 
Parameters:                                                                                                       
  # The name of the zero-ETL integration to be created
  IntegrationName:  
    Type: String
  # The ARN for the source of the zero-ETL integration
  SourceArn:
    Type: String
  # The ARN for the target of the zero-ETL integration 
  TargetArn:
    Type: String
#
#
Resources:
# Create an AWS Glue zero-ETL integration
  GlueIntegration:
    Type: AWS::Glue::Integration
    Properties:
      IntegrationName: !Ref IntegrationName
      Description: "AWS Glue zero-ETL integration"
      SourceArn: !Ref SourceArn
      TargetArn: !Ref TargetArn
      DataFilter: "include:table1"
      Tags:
        - Key: Purpose
          Value: GlueZeroETLIntegration
```

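## Sample CloudFormation template for an AWS Glue zero-ETL integration with integration resource properties
<a name="sample-cfn-template-integration-resource-property"></a>

AWS Glue zero-ETL integrations require resource properties to be defined for the source and the target. For the source, the only property that needs to be defined is the IAM role that the integration uses to access the AWS Glue connection or DynamoDB database. For the target, the properties that can be set include the IAM role used to access the target, the VPC network in which the integration should be created, the event bus used to set up integration event notifications, and the KMS key used for data encryption.

The following sample defines the source and target resource properties and then creates a zero-ETL integration from the source to the target.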

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate defining the integration resource properties and then creating a zero-ETL integration in AWS Glue
#
# Parameters section contains names that are substituted in the Resources section
# 
Parameters:
  # The name of the zero-ETL integration to be created
  IntegrationName:
    Type: String
  # The ARN for the target of the zero-ETL integration
  TargetArn:
    Type: String
  # The ARN for the IAM role that will be used to access the target
  TargetRoleArn:
    Type: String
  # The ARN for the source of the zero-ETL integration
  SourceArn:
    Type: String
  # The ARN for the IAM role that will be used to access the source
  SourceRoleArn:
    Type: String
#
#
Resources:
  # Integration Resource Property for zero-ETL target
  TargetIntegrationResourceProperty:
    Type: AWS::Glue::IntegrationResourceProperty
    Properties:
      ResourceArn: !Ref TargetArn
      TargetProcessingProperties:
        RoleArn: !Ref TargetRoleArn
      Tags:
        - Key: Purpose
          Value: TargetIrpTag

  # Integration Resource Property for zero-ETL source
  SourceIntegrationResourceProperty:
    Type: AWS::Glue::IntegrationResourceProperty
    Properties:
      ResourceArn: !Ref SourceArn
      SourceProcessingProperties:
        RoleArn: !Ref SourceRoleArn
      Tags:
        - Key: Purpose
          Value: SourceIRPTag

  # Create an AWS Glue zero-ETL integration
  GlueIntegration:
    Type: AWS::Glue::Integration
    Properties:
      IntegrationName: !Ref IntegrationName
      Description: "AWS Glue zero-ETL integration"
      SourceArn: !Ref SourceArn
      TargetArn: !Ref TargetArn
      DataFilter: "include:table1"
      Tags:
        - Key: Purpose
          Value: GlueZeroETLIntegration
```
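
The sample above does not declare an explicit creation order among its three resources, and CloudFormation may create resources that do not reference each other in parallel. If the resource properties must exist before the integration is created, one option is to add `DependsOn` to the integration. The following fragment is a sketch of the `GlueIntegration` entry only, intended to replace the same entry in the `Resources` section above; whether this ordering is strictly required in your setup is an assumption to verify.

```yaml
  # Sketch: create the integration only after both resource properties exist
  GlueIntegration:
    Type: AWS::Glue::Integration
    DependsOn:
      - SourceIntegrationResourceProperty
      - TargetIntegrationResourceProperty
    Properties:
      IntegrationName: !Ref IntegrationName
      Description: "AWS Glue zero-ETL integration"
      SourceArn: !Ref SourceArn
      TargetArn: !Ref TargetArn
      DataFilter: "include:table1"
      Tags:
        - Key: Purpose
          Value: GlueZeroETLIntegration
```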

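## Sample CloudFormation template for an AWS Glue crawler for JDBC
<a name="sample-cfn-template-crawler-jdbc"></a>

An AWS Glue crawler creates metadata tables in your Data Catalog that correspond to your data. You can then use these table definitions as sources and targets in your ETL jobs.

This sample creates a crawler, the required IAM role, and an AWS Glue database in the Data Catalog. When this crawler is run, it assumes the IAM role and creates a table in the database for the flights data that is stored in a MySQL database. The table is created with the prefix "`cfn_jdbc_1_`". The IAM role created by this template allows global permissions; you might want to create a custom role. No custom classifiers can be defined for JDBC data. AWS Glue built-in classifiers are used by default.

When you submit this sample to the CloudFormation console, you must confirm that you want to create the IAM role.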

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate creating a crawler
#
# Parameters section contains names that are substituted in the Resources section
# These parameters are the names of the resources created in the Data Catalog
Parameters:                                                                                                       
# The name of the crawler to be created
  CFNCrawlerName:  
    Type: String
    Default: cfn-crawler-jdbc-flights-1
# The name of the database to be created to contain tables	
  CFNDatabaseName:
    Type: String
    Default: cfn-database-jdbc-flights-1
# The prefix for all tables crawled and created	
  CFNTablePrefixName:
    Type: String
    Default: cfn_jdbc_1_
# The name of the existing connection to the MySQL database
  CFNConnectionName:  
    Type: String
    Default: cfn-connection-mysql-flights-1
# The name of the JDBC path (database/schema/table) with wildcard (%) to crawl	
  CFNJDBCPath:  
    Type: String
    Default: saldev/%		
#
#
# Resources section defines metadata for the Data Catalog
Resources:
#Create IAM Role assumed by the crawler. For demonstration, this role is given all permissions.
  CFNRoleFlights:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          -
            Effect: "Allow"
            Principal:
              Service:
                - "glue.amazonaws.com"
            Action:
              - "sts:AssumeRole"
      Path: "/"
      Policies:
        -
          PolicyName: "root"
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              -
                Effect: "Allow"
                Action: "*"
                Resource: "*"
 # Create a database to contain tables created by the crawler
  CFNDatabaseFlights:
    Type: AWS::Glue::Database
    Properties:
      CatalogId: !Ref AWS::AccountId
      DatabaseInput:
        Name: !Ref CFNDatabaseName
        Description: "AWS Glue container to hold metadata tables for the flights crawler"
 #Create a crawler to crawl the flights data in MySQL database
  CFNCrawlerFlights:
    Type: AWS::Glue::Crawler
    Properties:
      Name: !Ref CFNCrawlerName
      Role: !GetAtt CFNRoleFlights.Arn
      #Classifiers: none, use the default classifier
      Description: AWS Glue crawler to crawl flights data
      #Schedule: none, use default run-on-demand
      DatabaseName: !Ref CFNDatabaseName
      Targets:
        JdbcTargets:
          # JDBC MySQL database with the flights data
          - ConnectionName: !Ref CFNConnectionName
            Path: !Ref CFNJDBCPath
          #Exclusions: none
      TablePrefix: !Ref CFNTablePrefixName
      SchemaChangePolicy:
        UpdateBehavior: "UPDATE_IN_DATABASE"
        DeleteBehavior: "LOG"
      Configuration: "{\"Version\":1.0,\"CrawlerOutput\":{\"Partitions\":{\"AddOrUpdateBehavior\":\"InheritFromTable\"},\"Tables\":{\"AddOrUpdateBehavior\":\"MergeNewColumns\"}}}"
```

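## Sample CloudFormation template for an AWS Glue job for Amazon S3 to Amazon S3
<a name="sample-cfn-template-job-s3"></a>

An AWS Glue job in the Data Catalog contains the parameter values that are required to run a script in AWS Glue.

This sample creates a job that reads flight data from an Amazon S3 bucket in `csv` format and writes it to an Amazon S3 Parquet file. The script that is run by this job must already exist. You can generate an ETL script for your environment with the AWS Glue console. When this job is run, an IAM role with the correct permissions must also be supplied.

Common parameter values are shown in the template. For example, `AllocatedCapacity` (DPUs) defaults to 5.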

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate creating a job using the public flights S3 table in a public bucket
#
# Parameters section contains names that are substituted in the Resources section
# These parameters are the names of the resources created in the Data Catalog
Parameters:                                                                                                       
# The name of the job to be created
  CFNJobName:  
    Type: String
    Default: cfn-job-S3-to-S3-2
# The name of the IAM role that the job assumes. It must have access to data, script, temporary directory
  CFNIAMRoleName:  
    Type: String
    Default: AWSGlueServiceRoleGA
# The S3 path where the script for this job is located
  CFNScriptLocation:  
    Type: String
    Default: s3://aws-glue-scripts-123456789012-us-east-1/myid/sal-job-test2	
#
#
# Resources section defines metadata for the Data Catalog
Resources:                                      
# Create job to run script which accesses flightscsv table and write to S3 file as parquet.
# The script already exists and is called by this job	
  CFNJobFlights:
    Type: AWS::Glue::Job   
    Properties:
      Role: !Ref CFNIAMRoleName  
      #DefaultArguments: JSON object 
      # If the script is written in Scala, then set DefaultArguments={'--job-language': 'scala', '--class': 'your scala class'}
      #Connections:  No connection needed for S3 to S3 job 
      #  ConnectionsList  
      #MaxRetries: Double  
      Description: Job created with CloudFormation  
      #LogUri: String  
      Command:   
        Name: glueetl  
        ScriptLocation: !Ref CFNScriptLocation
             # for access to directories use proper IAM role with permission to buckets and folders that begin with "aws-glue-"					 
             # script uses temp directory from job definition if required (temp directory not used S3 to S3)
             # script defines target for output as s3://aws-glue-target/sal    			 
      AllocatedCapacity: 5  
      ExecutionProperty:   
        MaxConcurrentRuns: 1  
      Name: !Ref CFNJobName
```
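
The commented-out `DefaultArguments` property in the sample can be written as a YAML mapping. The following sketch shows the Scala case mentioned in the template comment. The job name, role, and script location are copied from the sample's parameter defaults; the class name `com.example.FlightsJob` is hypothetical and stands for the entry-point class of your own Scala script.

```yaml
---
AWSTemplateFormatVersion: '2010-09-09'
# Sketch: passing default arguments to a job whose script is written in Scala
Resources:
  CFNJobFlights:
    Type: AWS::Glue::Job
    Properties:
      Name: cfn-job-S3-to-S3-2
      Role: AWSGlueServiceRoleGA
      Description: Job created with CloudFormation
      Command:
        Name: glueetl
        ScriptLocation: s3://aws-glue-scripts-123456789012-us-east-1/myid/sal-job-test2
      DefaultArguments:
        # Keys are quoted because they begin with dashes
        "--job-language": "scala"
        # Hypothetical class name; replace with your Scala entry point
        "--class": "com.example.FlightsJob"
      AllocatedCapacity: 5
      ExecutionProperty:
        MaxConcurrentRuns: 1
```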

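## Sample CloudFormation template for an AWS Glue job for JDBC to Amazon S3
<a name="sample-cfn-template-job-jdbc"></a>

An AWS Glue job in the Data Catalog contains the parameter values that are required to run a script in AWS Glue.

This sample creates a job that reads flight data from a MySQL JDBC database, as defined by the connection named `cfn-connection-mysql-flights-1`, and writes it to an Amazon S3 Parquet file. The script that is run by this job must already exist. You can generate an ETL script for your environment with the AWS Glue console. When this job is run, an IAM role with the correct permissions must also be supplied.

Common parameter values are shown in the template. For example, `AllocatedCapacity` (DPUs) defaults to 5.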

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate creating a job using a MySQL JDBC DB with the flights data to an S3 file
#
# Parameters section contains names that are substituted in the Resources section
# These parameters are the names of the resources created in the Data Catalog
Parameters:                                                                                                       
# The name of the job to be created
  CFNJobName:  
    Type: String
    Default: cfn-job-JDBC-to-S3-1
# The name of the IAM role that the job assumes. It must have access to data, script, temporary directory
  CFNIAMRoleName:  
    Type: String
    Default: AWSGlueServiceRoleGA
# The S3 path where the script for this job is located
  CFNScriptLocation:  
    Type: String
    Default: s3://aws-glue-scripts-123456789012-us-east-1/myid/sal-job-dec4a	
# The name of the connection used for JDBC data source
  CFNConnectionName:  
    Type: String
    Default: cfn-connection-mysql-flights-1
#
#
# Resources section defines metadata for the Data Catalog
Resources:                                      
# Create job to run script which accesses JDBC flights table via a connection and write to S3 file as parquet.
# The script already exists and is called by this job	
  CFNJobFlights:
    Type: AWS::Glue::Job   
    Properties:
      Role: !Ref CFNIAMRoleName  
      #DefaultArguments: JSON object  
      # For example, if required by script, set temporary directory as DefaultArguments={'--TempDir': 's3://aws-glue-temporary-xyc/sal'}
      Connections:
        Connections:
        - !Ref CFNConnectionName 
      #MaxRetries: Double  
      Description: Job created with CloudFormation using existing script
      #LogUri: String  
      Command:   
        Name: glueetl  
        ScriptLocation: !Ref CFNScriptLocation
             # for access to directories use proper IAM role with permission to buckets and folders that begin with "aws-glue-"					 
             # if required, script defines temp directory as argument TempDir and used in script like redshift_tmp_dir = args["TempDir"] 
             # script defines target for output as s3://aws-glue-target/sal    			 
      AllocatedCapacity: 5  
      ExecutionProperty:   
        MaxConcurrentRuns: 1  
      Name: !Ref CFNJobName
```
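
The commented-out `DefaultArguments` property in the template above can be filled in when the script needs a temporary directory. The fragment below is a minimal sketch, assuming the script reads a `--TempDir` argument and that the bucket path mentioned in the template comment exists in your account. It shows only the relevant properties and is not a complete job definition.

```
Resources:
  CFNJobFlights:
    Type: AWS::Glue::Job
    Properties:
      # Role and script location reuse the defaults from the template above
      Role: AWSGlueServiceRoleGA
      Command:
        Name: glueetl
        ScriptLocation: s3://aws-glue-scripts-123456789012-us-east-1/myid/sal-job-dec4a
      # DefaultArguments is a map of argument names to values.
      # The key and bucket path come from the template comment; adjust both
      # to whatever your script actually expects.
      DefaultArguments:
        "--TempDir": "s3://aws-glue-temporary-xyc/sal"
```

Inside the script, the value would then be available as `args["TempDir"]`, as the comments in the template describe.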

## Sample CloudFormation template for an AWS Glue on-demand trigger
<a name="sample-cfn-template-trigger-ondemand"></a>

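An AWS Glue trigger in the Data Catalog contains the parameter values that are required to start a job run when the trigger fires. An on-demand trigger fires when you enable it.

This sample creates an on-demand trigger that starts one job named `cfn-job-S3-to-S3-1`.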

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate creating an on-demand trigger
#
# Parameters section contains names that are substituted in the Resources section
# These parameters are the names of the resources created in the Data Catalog
Parameters:
  # The existing job to be started by this trigger 
  CFNJobName:
    Type: String
    Default: cfn-job-S3-to-S3-1
  # The name of the trigger to be created
  CFNTriggerName:
    Type: String
    Default: cfn-trigger-ondemand-flights-1	
#
# Resources section defines metadata for the Data Catalog
# Sample CFN YAML to demonstrate creating an on-demand trigger for a job	
Resources:                                      
# Create trigger to run an existing job (CFNJobName) on an on-demand schedule.	
  CFNTriggerSample:
    Type: AWS::Glue::Trigger   
    Properties:
      Name:
        Ref: CFNTriggerName		
      Description: Trigger created with CloudFormation
      Type: ON_DEMAND                                                        	   
      Actions:
        - JobName: !Ref CFNJobName                	  
        # Arguments: JSON object
      #Schedule: 
      #Predicate:
```

## Sample CloudFormation template for an AWS Glue scheduled trigger
<a name="sample-cfn-template-trigger-scheduled"></a>

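An AWS Glue trigger in the Data Catalog contains the parameter values that are required to start a job run when the trigger fires. A scheduled trigger fires when you enable it and the cron timer elapses.

This sample creates a scheduled trigger that starts one job named `cfn-job-S3-to-S3-1`. The timer is a cron expression that runs the job every 10 minutes on weekdays.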

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate creating a scheduled trigger
#
# Parameters section contains names that are substituted in the Resources section
# These parameters are the names of the resources created in the Data Catalog
Parameters:
  # The existing job to be started by this trigger 
  CFNJobName:
    Type: String
    Default: cfn-job-S3-to-S3-1
  # The name of the trigger to be created
  CFNTriggerName:
    Type: String
    Default: cfn-trigger-scheduled-flights-1	
#
# Resources section defines metadata for the Data Catalog
# Sample CFN YAML to demonstrate creating a scheduled trigger for a job
#	
Resources:                                      
# Create trigger to run an existing job (CFNJobName) on a cron schedule.	
  TriggerSample1CFN:
    Type: AWS::Glue::Trigger   
    Properties:
      Name:
        Ref: CFNTriggerName		
      Description: Trigger created with CloudFormation
      Type: SCHEDULED                                                        	   
      Actions:
        - JobName: !Ref CFNJobName                	  
        # Arguments: JSON object
      # Run the trigger every 10 minutes, Monday through Friday
      Schedule: cron(0/10 * ? * MON-FRI *) 
      #Predicate:
```
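
The schedule string in the template above follows the six-field AWS Glue cron format. The annotated fragment below is a sketch of how the fields line up. It also sets `StartOnCreation`, a property the original template does not use, on the assumption that you want the trigger activated as soon as the stack creates it.

```
Resources:
  TriggerSample1CFN:
    Type: AWS::Glue::Trigger
    Properties:
      Type: SCHEDULED
      # Assumption: activate the trigger at creation time (not in the original template)
      StartOnCreation: true
      Actions:
        - JobName: cfn-job-S3-to-S3-1
      # cron(Minutes Hours Day-of-month Month Day-of-week Year)
      #   0/10    -> every 10 minutes, starting at minute 0
      #   *       -> every hour
      #   ?       -> no specific day of month (day of week is used instead)
      #   *       -> every month
      #   MON-FRI -> Monday through Friday
      #   *       -> every year
      Schedule: cron(0/10 * ? * MON-FRI *)
```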

## Sample CloudFormation template for an AWS Glue conditional trigger
<a name="sample-cfn-template-trigger-conditional"></a>

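An AWS Glue trigger in the Data Catalog contains the parameter values that are required to start a job run when the trigger fires. A conditional trigger fires when you enable it and its conditions are met, such as a job completing successfully.

This sample creates a conditional trigger that starts one job named `cfn-job-S3-to-S3-1`. This job starts when the job named `cfn-job-S3-to-S3-2` completes successfully.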

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate creating a conditional trigger for a job, which starts when another job completes
#
# Parameters section contains names that are substituted in the Resources section
# These parameters are the names of the resources created in the Data Catalog
Parameters:
  # The existing job to be started by this trigger 
  CFNJobName:
    Type: String
    Default: cfn-job-S3-to-S3-1
  # The existing job whose completion causes the trigger to fire
  CFNJobName2:
    Type: String
    Default: cfn-job-S3-to-S3-2	
  # The name of the trigger to be created
  CFNTriggerName:
    Type: String
    Default: cfn-trigger-conditional-1	
#	
Resources:                                      
# Create trigger to run an existing job (CFNJobName) when another job completes (CFNJobName2).	
  CFNTriggerSample:
    Type: AWS::Glue::Trigger   
    Properties:
      Name:
        Ref: CFNTriggerName		
      Description: Trigger created with CloudFormation
      Type: CONDITIONAL                                                        	   
      Actions:
        - JobName: !Ref CFNJobName                	  
        # Arguments: JSON object
      #Schedule: none 
      Predicate:
        #Value for Logical is required if more than 1 job listed in Conditions	  
        Logical: AND
        Conditions:
          - LogicalOperator: EQUALS	
            JobName: !Ref CFNJobName2
            State: SUCCEEDED
```
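
The comment in the template above notes that `Logical` is required when more than one job is listed in `Conditions`. The fragment below sketches that case. The third job name, `cfn-job-S3-to-S3-3`, is hypothetical and is not part of the original sample.

```
Resources:
  CFNTriggerSample:
    Type: AWS::Glue::Trigger
    Properties:
      Type: CONDITIONAL
      Actions:
        - JobName: cfn-job-S3-to-S3-1
      Predicate:
        # AND: the trigger fires only after every listed condition is met
        Logical: AND
        Conditions:
          - LogicalOperator: EQUALS
            JobName: cfn-job-S3-to-S3-2
            State: SUCCEEDED
          # Hypothetical second upstream job
          - LogicalOperator: EQUALS
            JobName: cfn-job-S3-to-S3-3
            State: SUCCEEDED
```

Changing `Logical` to `ANY` would instead fire the trigger as soon as either upstream job succeeds.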

## Sample CloudFormation template for an AWS Glue machine learning transform
<a name="sample-cfn-template-machine-learning-transform"></a>

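An AWS Glue machine learning transform is a custom transform that cleanses your data. Currently, one transform named FindMatches is available. The FindMatches transform enables you to identify duplicate or matching records in your dataset, even when the records do not have a common unique identifier and no fields match exactly.

This sample creates a machine learning transform. For more information about the parameters that you need to create a machine learning transform, see [Record matching with AWS Lake Formation FindMatches](machine-learning.md).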

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate creating a machine learning transform
#
# Resources section defines metadata for the machine learning transform
Resources:
  MyMLTransform:
    Type: "AWS::Glue::MLTransform"
    # This template must also define the isGlueMLGARegion condition in a Conditions section
    Condition: "isGlueMLGARegion"
    Properties:
      Name: !Sub "MyTransform"
      Description: "The bestest transform ever"
      Role: !ImportValue MyMLTransformUserRole
      GlueVersion: "1.0"
      WorkerType: "Standard"
      NumberOfWorkers: 5
      Timeout: 120
      MaxRetries: 1
      InputRecordTables:
        GlueTables:
          - DatabaseName: !ImportValue MyMLTransformDatabase
            TableName: !ImportValue MyMLTransformTable
      TransformParameters:
        TransformType: "FIND_MATCHES"
        FindMatchesParameters:
          PrimaryKeyColumnName: "testcolumn"
          PrecisionRecallTradeoff: 0.5
          AccuracyCostTradeoff: 0.5
          EnforceProvidedLabels: True
      Tags:
        key1: "value1"
        key2: "value2"
      TransformEncryption:
        TaskRunSecurityConfigurationName: !ImportValue MyMLTransformSecurityConfiguration
        MLUserDataEncryption:
          MLUserDataEncryptionMode: "SSE-KMS"
          KmsKeyId: !ImportValue MyMLTransformEncryptionKey
```
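
The transform above references a condition named `isGlueMLGARegion`, which the sample does not define. A minimal sketch of a matching `Conditions` section follows. The Region list is purely an assumption for illustration; replace it with the Regions where you intend to deploy the transform.

```
Conditions:
  # Hypothetical Region list; the original sample does not specify one
  isGlueMLGARegion: !Or
    - !Equals [!Ref "AWS::Region", "us-east-1"]
    - !Equals [!Ref "AWS::Region", "us-west-2"]
```

With this section in place, CloudFormation creates `MyMLTransform` only when the stack is deployed in one of the listed Regions.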

## Sample CloudFormation template for an AWS Glue Data Quality ruleset
<a name="sample-cfn-template-data-quality-ruleset"></a>

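An AWS Glue Data Quality ruleset contains rules that can be evaluated on a table in the Data Catalog. After the ruleset is placed on your target table, you can go to the Data Catalog and run an evaluation that checks your data against the rules in the ruleset. These rules can range from evaluating row counts to evaluating the referential integrity of your data.

The following sample is a CloudFormation template that creates a ruleset with a variety of rules on the specified target table.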

```
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate creating a DataQualityRuleset
#
# Parameters section contains names that are substituted in the Resources section
# These parameters are the names of the resources created in the Data Catalog
Parameters:                                                                                                       
  # The name of the ruleset to be created
  RulesetName:  
    Type: String
    Default: "CFNRulesetName"
  RulesetDescription:  
    Type: String
    Default: "CFN DataQualityRuleset"
  # Rules that will be associated with this ruleset
  Rules:  
    Type: String
    Default: 'Rules = [
        RowCount > 100,
        IsUnique "id",
        IsComplete "nametype"
        ]'
  # Name of the database and table within the Data Catalog that the ruleset
  # will be applied to
  DatabaseName:  
    Type: String
    Default: "ExampleDatabaseName"
  TableName:  
    Type: String
    Default: "ExampleTableName"

# Resources section defines metadata for the Data Catalog
Resources:
  # Creates a Data Quality ruleset under specified rules 
  DQRuleset:
    Type: AWS::Glue::DataQualityRuleset
    Properties:
      Name: !Ref RulesetName
      Description: !Ref RulesetDescription
      # The String within rules must be formatted in DQDL, a language 
      # used specifically to make rules
      Ruleset: !Ref Rules
      # The targeted table must exist within Data Catalog alongside 
      # the correct database
      TargetTable:
        DatabaseName: !Ref DatabaseName
        TableName: !Ref TableName
```
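
Rules in DQDL can also carry thresholds rather than strict pass/fail checks. The variant below is a sketch that writes the ruleset inline as a YAML block scalar and swaps `IsComplete` for a `Completeness` rule. The 0.95 threshold is an illustrative assumption, and the column names are the ones used in the sample above.

```
Resources:
  DQRuleset:
    Type: AWS::Glue::DataQualityRuleset
    Properties:
      Name: CFNRulesetName
      Description: CFN DataQualityRuleset
      # Block scalar keeps the DQDL readable, one rule per line
      Ruleset: |
        Rules = [
          RowCount > 100,
          IsUnique "id",
          Completeness "nametype" > 0.95
        ]
      TargetTable:
        DatabaseName: ExampleDatabaseName
        TableName: ExampleTableName
```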

## Sample CloudFormation template for an AWS Glue Data Quality ruleset with an EventBridge scheduler
<a name="sample-cfn-template-data-quality-ruleset-eventbridge"></a>

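An AWS Glue Data Quality ruleset contains rules that can be evaluated on a table in the Data Catalog. After the ruleset is placed on your target table, you can go to the Data Catalog and run an evaluation that checks your data against the rules in the ruleset. You can also add an EventBridge scheduler to your CloudFormation template to schedule these ruleset evaluations at a timed interval, instead of going to the Data Catalog to evaluate the ruleset manually.

The following sample is a CloudFormation template that creates a Data Quality ruleset and an EventBridge scheduler that evaluates that ruleset every five minutes.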

```
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate creating a DataQualityRuleset and an EventBridge schedule that evaluates it
#
# Parameters section contains names that are substituted in the Resources section
# These parameters are the names of the resources created in the Data Catalog
Parameters:                                                                                                       
  # The name of the ruleset to be created
  RulesetName:  
    Type: String
    Default: "CFNRulesetName"
  # Rules that will be associated with this Ruleset
  Rules:  
    Type: String
    Default: 'Rules = [
        RowCount > 100,
        IsUnique "id",
        IsComplete "nametype"
        ]'
  # The name of the Schedule to be created  
  ScheduleName:  
    Type: String
    Default: "ScheduleDQRulsetEvaluation"
  # This expression determines the rate at which the Schedule will evaluate
  # your data using the above ruleset
  ScheduleRate:
    Type: String
    Default: "rate(5 minutes)"
  # The request that is sent must match the details of the Data Quality ruleset
  ScheduleRequest:
    Type: String
    Default: '
        { "DataSource": { "GlueTable": { "DatabaseName": "ExampleDatabaseName",
         "TableName": "ExampleTableName" } },
         "Role": "role/AWSGlueServiceRoleDefault",
          "RulesetNames": [ "CFNRulesetName" ] }
        '

# Resources section defines metadata for the Data Catalog
Resources:
  # Creates a Data Quality ruleset under specified rules 
  DQRuleset:
    Type: AWS::Glue::DataQualityRuleset
    Properties:
      Name: !Ref RulesetName
      Description: "CFN DataQualityRuleset"
      # The String within rules must be formatted in DQDL, a language 
      # used specifically to make rules
      Ruleset: !Ref Rules
      # The targeted table must exist within Data Catalog alongside 
      # the correct database
      TargetTable:
        DatabaseName: "ExampleDatabaseName"
        TableName: "ExampleTableName"
  # Create a Scheduler to schedule evaluation runs on the above ruleset
  ScheduleDQEval:
    Type: AWS::Scheduler::Schedule
    Properties: 
      Name: !Ref ScheduleName
      Description: "Schedule DataQualityRuleset Evaluations"
      FlexibleTimeWindow: 
        Mode: "OFF"
      ScheduleExpression: !Ref ScheduleRate
      ScheduleExpressionTimezone: "America/New_York"
      State: "ENABLED"
      Target: 
        # The ARN is the API that will be run, since we want to evaluate our ruleset
        # we want this specific ARN
        Arn: "arn:aws:scheduler:::aws-sdk:glue:startDataQualityRulesetEvaluationRun"
        # Your RoleArn must have approval to schedule
        RoleArn: "arn:aws:iam::123456789012:role/AWSGlueServiceRoleDefault"
        # This is the request that is sent to the Arn. Its values match the
        # ruleset name and target table defined in this template.
        Input: '
          { "DataSource": { "GlueTable": { "DatabaseName": "ExampleDatabaseName", "TableName": "ExampleTableName" } },
           "Role": "role/AWSGlueServiceRoleDefault",
            "RulesetNames": [ "CFNRulesetName" ] }
          '
```
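
The comment on `RoleArn` in the template above says the role must be allowed to schedule the evaluation. One way to satisfy that is to define a dedicated role in the same template, sketched below. The resource name `SchedulerInvokeGlueRole` and the policy name are hypothetical, and the `iam:PassRole` statement assumes the evaluation run passes the Glue service role named in the sample.

```
Resources:
  # Hypothetical role; not part of the original sample
  SchedulerInvokeGlueRole:
    Type: AWS::IAM::Role
    Properties:
      # Let EventBridge Scheduler assume this role
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: scheduler.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: StartDataQualityEvaluation
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              # Allow the API call that the schedule target invokes
              - Effect: Allow
                Action: glue:StartDataQualityRulesetEvaluationRun
                Resource: '*'
              # Assumption: the run passes this Glue service role
              - Effect: Allow
                Action: iam:PassRole
                Resource: arn:aws:iam::123456789012:role/AWSGlueServiceRoleDefault
```

The schedule's `Target` could then use `RoleArn: !GetAtt SchedulerInvokeGlueRole.Arn` in place of a hard-coded ARN.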

## Sample CloudFormation template for an AWS Glue development endpoint
<a name="sample-cfn-template-devendpoint"></a>

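An AWS Glue development endpoint is an environment that you can use to develop and test your AWS Glue scripts.

This sample creates a development endpoint using only the minimum parameter values needed to create it successfully. For more information about the parameters that you need to set up a development endpoint, see [Setting up networking for development for AWS Glue](start-development-endpoint.md).

You provide an existing IAM role ARN (Amazon Resource Name) to create the development endpoint. If you plan to create a notebook server on the development endpoint, supply a valid RSA public key and keep the corresponding private key available.

**Note**  
You manage any notebook server that you create and associate with a development endpoint. Therefore, if you delete the development endpoint, you must delete the CloudFormation stack on the CloudFormation console to delete the notebook server.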

```
---
AWSTemplateFormatVersion: '2010-09-09'
# Sample CFN YAML to demonstrate creating a development endpoint
#
# Parameters section contains names that are substituted in the Resources section
# These parameters are the names of the resources created in the Data Catalog
Parameters:
# The name of the development endpoint to be created
  CFNEndpointName:  
    Type: String
    Default: cfn-devendpoint-1
  CFNIAMRoleArn:
    Type: String
    Default: arn:aws:iam::123456789012:role/AWSGlueServiceRoleGA
#
#
# Resources section defines metadata for the Data Catalog
Resources:
  CFNDevEndpoint:
    Type: AWS::Glue::DevEndpoint
    Properties:
      EndpointName: !Ref CFNEndpointName
      #ExtraJarsS3Path: String
      #ExtraPythonLibsS3Path: String
      NumberOfNodes: 5
      PublicKey: ssh-rsa public.....key myuserid-key
      RoleArn: !Ref CFNIAMRoleArn
      SecurityGroupIds: 
        - sg-64986c0b
      SubnetId: subnet-c67cccac
```
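
The template above hard-codes the security group and subnet IDs. As a sketch of an alternative, the fragment below takes them as parameters with AWS-specific parameter types, so CloudFormation can check that the IDs exist before it creates the endpoint. The parameter names `CFNSecurityGroupIds` and `CFNSubnetId` are hypothetical.

```
Parameters:
  # Hypothetical parameter names; the AWS-specific types make CloudFormation
  # validate the supplied IDs against the account and Region
  CFNSecurityGroupIds:
    Type: List<AWS::EC2::SecurityGroup::Id>
  CFNSubnetId:
    Type: AWS::EC2::Subnet::Id
Resources:
  CFNDevEndpoint:
    Type: AWS::Glue::DevEndpoint
    Properties:
      EndpointName: cfn-devendpoint-1
      RoleArn: arn:aws:iam::123456789012:role/AWSGlueServiceRoleGA
      # !Ref on a List parameter returns the list of IDs
      SecurityGroupIds: !Ref CFNSecurityGroupIds
      SubnetId: !Ref CFNSubnetId
```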