


# Create and query a partitioned table based on Amazon EMR logs
<a name="emr-create-table-partitioned"></a>

These examples use the same log location to create an Athena table, but the table is partitioned, and a partition is then created for each log location. For more information, see [Partitioning your data](partitions.md).

The following query creates the partitioned table named `mypartitionedemrlogs`:

```
CREATE EXTERNAL TABLE `mypartitionedemrlogs`(
  `data` string COMMENT 'from deserializer')
 partitioned by (logtype string)
ROW FORMAT DELIMITED  
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\n'
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://aws-logs-123456789012-us-west-2/elasticmapreduce/j-2ABCDE34F5GH6'
```

The following query statements then create table partitions based on the subdirectories for the different log types that Amazon EMR creates in Amazon S3:

```
ALTER TABLE mypartitionedemrlogs ADD
     PARTITION (logtype='containers')
     LOCATION 's3://aws-logs-123456789012-us-west-2/elasticmapreduce/j-2ABCDE34F5GH6/containers/'
```

```
ALTER TABLE mypartitionedemrlogs ADD
     PARTITION (logtype='hadoop-mapreduce')
     LOCATION 's3://aws-logs-123456789012-us-west-2/elasticmapreduce/j-2ABCDE34F5GH6/hadoop-mapreduce/'
```

```
ALTER TABLE mypartitionedemrlogs ADD
     PARTITION (logtype='hadoop-state-pusher')
     LOCATION 's3://aws-logs-123456789012-us-west-2/elasticmapreduce/j-2ABCDE34F5GH6/hadoop-state-pusher/'
```

```
ALTER TABLE mypartitionedemrlogs ADD
     PARTITION (logtype='node')
     LOCATION 's3://aws-logs-123456789012-us-west-2/elasticmapreduce/j-2ABCDE34F5GH6/node/'
```

```
ALTER TABLE mypartitionedemrlogs ADD
     PARTITION (logtype='steps')
     LOCATION 's3://aws-logs-123456789012-us-west-2/elasticmapreduce/j-2ABCDE34F5GH6/steps/'
```
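The five `ALTER TABLE` statements above differ only in the log type, so the DDL can be generated programmatically before being submitted to Athena. The following is a minimal sketch; the `build_add_partition` helper is hypothetical, and the bucket, Region, and cluster ID are the placeholder values from the examples above:

```python
# Generate the ALTER TABLE ... ADD PARTITION statement for each EMR log type.
# Adapt the table name and S3 location to your own account ID, Region,
# and cluster ID before running the output in Athena.

BASE_LOCATION = "s3://aws-logs-123456789012-us-west-2/elasticmapreduce/j-2ABCDE34F5GH6"
LOG_TYPES = ["containers", "hadoop-mapreduce", "hadoop-state-pusher", "node", "steps"]

def build_add_partition(table, logtype, base_location):
    """Return the DDL that registers one log-type subdirectory as a partition."""
    return (
        f"ALTER TABLE {table} ADD\n"
        f"     PARTITION (logtype='{logtype}')\n"
        f"     LOCATION '{base_location}/{logtype}/'"
    )

statements = [build_add_partition("mypartitionedemrlogs", t, BASE_LOCATION)
              for t in LOG_TYPES]
for stmt in statements:
    print(stmt)
```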

After you create the partitions, you can run a `SHOW PARTITIONS` query on the table to confirm:

```
SHOW PARTITIONS mypartitionedemrlogs;
```
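If the five partitions above were added successfully, the output lists one Hive-style partition specification per line, similar to the following:

```
logtype=containers
logtype=hadoop-mapreduce
logtype=hadoop-state-pusher
logtype=node
logtype=steps
```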

## Example queries
<a name="emr-example-queries-partitioned"></a>

The following examples demonstrate queries for specific log entries that use the table and partitions created in the preceding examples.

**Example – Query the containers partition for application\_1561661818238\_0002 logs that contain ERROR or WARN**  

```
SELECT data,
        "$PATH"
FROM "default"."mypartitionedemrlogs"
WHERE logtype='containers'
        AND regexp_like("$PATH",'application_1561661818238_0002')
        AND regexp_like(data, 'ERROR|WARN') limit 100;
```
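In these queries, `regexp_like` returns true when the pattern matches anywhere in the string, so `'ERROR|WARN'` keeps any row whose data contains either token. The same filter logic can be sketched in Python; the sample log lines below are hypothetical:

```python
import re

# regexp_like(data, 'ERROR|WARN') is an unanchored match: a row passes
# when either alternative appears anywhere in the line.
pattern = re.compile(r"ERROR|WARN")

sample_lines = [  # hypothetical container-log lines
    "2019-06-27 19:30:01 INFO  starting container",
    "2019-06-27 19:30:05 WARN  container memory above threshold",
    "2019-06-27 19:30:09 ERROR container exited with code 1",
]

matches = [line for line in sample_lines if pattern.search(line)]
print(matches)
```

Only the WARN and ERROR lines survive the filter; the INFO line is dropped, just as the corresponding rows would be excluded by the `WHERE` clause above.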

**Example – Query the hadoop-mapreduce partition for job\_1561661818238\_0004 and failed reduces**  

```
SELECT data,
        "$PATH"
FROM "default"."mypartitionedemrlogs"
WHERE logtype='hadoop-mapreduce'
        AND regexp_like(data,'job_1561661818238_0004|Failed Reduces') limit 100;
```

**Example – Query Hive logs in the node partition for query ID 056e0609-33e1-4611-956c-7a31b42d2663**  

```
SELECT data,
        "$PATH"
FROM "default"."mypartitionedemrlogs"
WHERE logtype='node'
        AND regexp_like("$PATH",'hive')
        AND regexp_like(data,'056e0609-33e1-4611-956c-7a31b42d2663') limit 100;
```

**Example – Query resourcemanager logs in the node partition for application 1567660019320\_0001\_01\_000001**  

```
SELECT data,
        "$PATH"
FROM "default"."mypartitionedemrlogs"
WHERE logtype='node'
        AND regexp_like(data,'resourcemanager')
        AND regexp_like(data,'1567660019320_0001_01_000001') limit 100;
```