

本文為英文版的機器翻譯版本，如內容有任何歧義或不一致之處，概以英文版為準。

# 從獨立 Spark 應用程式連線至 Data Catalog
<a name="connect-gludc-spark"></a>

您可以使用 Apache Iceberg 連接器從獨立應用程式連線至 Data Catalog。

1. 為 Spark 應用程式建立 IAM 角色。

1. 使用 AWS Glue Iceberg 連接器連接到 Iceberg Rest 端點。

   ```
   # configure your application. Refer to https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html for best practices on configuring environment variables.
   export AWS_ACCESS_KEY_ID=$(aws configure get appUser.aws_access_key_id)
   export AWS_SECRET_ACCESS_KEY=$(aws configure get appUser.aws_secret_access_key)
   export AWS_SESSION_TOKEN=$(aws configure get appUser.aws_secret_token)
   
   export AWS_REGION=us-east-1
   export REGION=us-east-1
   export AWS_ACCOUNT_ID = {specify your aws account id here}
   
   ~/spark-3.5.3-bin-hadoop3/bin/spark-shell \
       --packages org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.6.0 \
       --conf "spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions" \
       --conf "spark.sql.defaultCatalog=spark_catalog" \
       --conf "spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkCatalog" \
       --conf "spark.sql.catalog.spark_catalog.type=rest" \
       --conf "spark.sql.catalog.spark_catalog.uri=https://glue.us-east-1.amazonaws.com/iceberg" \
       --conf "spark.sql.catalog.spark_catalog.warehouse = {AWS_ACCOUNT_ID}" \
       --conf "spark.sql.catalog.spark_catalog.rest.sigv4-enabled=true" \
       --conf "spark.sql.catalog.spark_catalog.rest.signing-name=glue" \
       --conf "spark.sql.catalog.spark_catalog.rest.signing-region=us-east-1" \
       --conf "spark.sql.catalog.spark_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO" \
       --conf "spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialProvider"
   ```

1. 查詢 Data Catalog 中的資料。

   ```
   spark.sql("create database myicebergdb").show()
   spark.sql("""CREATE TABLE myicebergdb.mytbl (name string) USING iceberg location 's3://{{bucket_name}}/{{mytbl}}'""")
   spark.sql("insert into myicebergdb.mytbl values('demo') ").show()
   ```