# Accessing the Data Catalog
<a name="access_catalog"></a>

You can use the AWS Glue Data Catalog (Data Catalog) to discover and understand your data. The Data Catalog provides a consistent way to maintain schema definitions, data types, locations, and other metadata. You can access the Data Catalog using the following methods:
+ AWS Glue console – You can access and manage the Data Catalog through the AWS Glue console, a web-based user interface. The console allows you to browse and search for databases, tables, and their associated metadata, as well as create, update, and delete metadata definitions. 
+ AWS Glue crawler – Crawlers are programs that automatically scan your data sources and populate the Data Catalog with metadata. You can create and run crawlers to discover and catalog data from various sources such as Amazon S3, Amazon RDS, Amazon DynamoDB, Amazon CloudWatch, and JDBC-compliant relational databases such as MySQL and PostgreSQL, as well as several non-AWS sources such as Snowflake and Google BigQuery.
+ AWS Glue APIs – You can access the Data Catalog programmatically using the AWS Glue APIs, which enables automation and integration with other applications and services. 
+ AWS Command Line Interface (AWS CLI) – You can use the AWS CLI to access and manage the Data Catalog from the command line. The CLI provides commands for creating, updating, and deleting metadata definitions, as well as querying and retrieving metadata information. 
+ Integration with other AWS services – The Data Catalog integrates with various other AWS services, allowing you to access and utilize the metadata stored in the catalog. For example, you can use Amazon Athena to query data sources using the metadata in the Data Catalog, and use AWS Lake Formation to manage data access and governance for the Data Catalog resources. 

**Topics**
+ [Connecting to the Data Catalog using AWS Glue Iceberg REST endpoint](connect-glu-iceberg-rest.md)
+ [Connecting to the Data Catalog using AWS Glue Iceberg REST extension endpoint](connect-glue-iceberg-rest-ext.md)
+ [AWS Glue REST APIs for Apache Iceberg specifications](iceberg-rest-apis.md)
+ [Connecting to Data Catalog from a standalone Spark application](connect-gludc-spark.md)
+ [Data mapping between Amazon Redshift and Apache Iceberg](data-mapping-rs-iceberg.md)
+ [Considerations and limitations when using AWS Glue Iceberg REST Catalog APIs](limitation-glue-iceberg-rest-api.md)

# Connecting to the Data Catalog using AWS Glue Iceberg REST endpoint
<a name="connect-glu-iceberg-rest"></a>

 AWS Glue's Iceberg REST endpoint supports API operations specified in the Apache Iceberg REST specification. Using an Iceberg REST client, you can connect your application running on an analytics engine to the REST catalog hosted in the Data Catalog.

The endpoint supports both the v1 and v2 Apache Iceberg table specifications and defaults to v2. To use the Iceberg table v1 specification, you must specify v1 in the API call. Using the API operations, you can access Iceberg tables stored in both Amazon S3 object storage and Amazon S3 Table storage. 

**Endpoint configuration**

You can access the AWS Glue Iceberg REST catalog using the service endpoint. Refer to the [AWS Glue service endpoints reference guide](https://docs.aws.amazon.com/general/latest/gr/glue.html#glue_region) for the region-specific endpoint. For example, when connecting to AWS Glue in the us-east-1 Region, you need to configure the endpoint URI property as follows: 

```
Endpoint : https://glue.us-east-1.amazonaws.com/iceberg
```

**Additional configuration properties** – When you use an Iceberg client to connect an analytics engine such as Spark to the service endpoint, you must specify the following application configuration properties:

```
from pyspark.sql import SparkSession

# Replace these placeholder values with your own.
catalog_name = "mydatacatalog"
aws_account_id = "123456789012"
aws_region = "us-east-1"
spark = SparkSession.builder \
    ... \
    .config("spark.sql.defaultCatalog", catalog_name) \
    .config(f"spark.sql.catalog.{catalog_name}", "org.apache.iceberg.spark.SparkCatalog") \
    .config(f"spark.sql.catalog.{catalog_name}.type", "rest") \
    .config(f"spark.sql.catalog.{catalog_name}.uri", f"https://glue.{aws_region}.amazonaws.com/iceberg") \
    .config(f"spark.sql.catalog.{catalog_name}.warehouse", aws_account_id) \
    .config(f"spark.sql.catalog.{catalog_name}.rest.sigv4-enabled", "true") \
    .config(f"spark.sql.catalog.{catalog_name}.rest.signing-name", "glue") \
    .config("spark.sql.extensions","org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions") \
    .getOrCreate()
```
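The properties in the preceding example can also be collected in a plain dictionary before they are passed to the session builder, which makes the Region and account ID substitutions easy to inspect. The following is a minimal sketch; the helper name `glue_rest_catalog_conf` is hypothetical and is not part of Spark or Iceberg.

```python
def glue_rest_catalog_conf(catalog_name: str, aws_account_id: str, aws_region: str) -> dict:
    """Return the Spark properties from the preceding example as a dictionary."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        "spark.sql.defaultCatalog": catalog_name,
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": f"https://glue.{aws_region}.amazonaws.com/iceberg",
        f"{prefix}.warehouse": aws_account_id,
        f"{prefix}.rest.sigv4-enabled": "true",
        f"{prefix}.rest.signing-name": "glue",
        "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
    }


# Placeholder values, matching the preceding example.
conf = glue_rest_catalog_conf("mydatacatalog", "123456789012", "us-east-1")
for key, value in conf.items():
    print(f"{key}={value}")
```

Each key-value pair can then be applied with `SparkSession.builder.config(key, value)`.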

The AWS Glue Iceberg endpoint `https://glue.us-east-1.amazonaws.com/iceberg` supports the following Iceberg REST APIs:
+ GetConfig
+ ListNamespaces
+ CreateNamespace
+ LoadNamespaceMetadata
+ UpdateNamespaceProperties
+ DeleteNamespace
+ ListTables
+ CreateTable
+ LoadTable
+ TableExists
+ UpdateTable
+ DeleteTable

## Prefix and catalog path parameters
<a name="prefix-catalog-path-parameters"></a>

Iceberg REST catalog APIs have a free-form prefix in their request URLs. For example, the `ListNamespaces` API call uses the `GET /v1/{prefix}/namespaces` URL format. The AWS Glue prefix always follows the `/catalogs/{catalog}` structure to ensure that the REST path aligns with the AWS Glue multi-catalog hierarchy. The `{catalog}` path parameter can be derived based on the following rules:


| **Access pattern** |  **Glue catalog ID Style**  |  **Prefix Style**  | **Example default catalog ID** |  **Example REST route**  | 
| --- | --- | --- | --- | --- | 
|  Access the default catalog in current account  | not required | : |  not applicable  |  GET /v1/catalogs/:/namespaces  | 
|  Access the default catalog in a specific account  | accountID | accountID | 111122223333 | GET /v1/catalogs/111122223333/namespaces | 
|  Access a nested catalog in current account  |  catalog1/catalog2  |  catalog1:catalog2  |  rmscatalog1:db1  |  GET /v1/catalogs/rmscatalog1:db1/namespaces  | 
|  Access a nested catalog in a specific account  |  accountID:catalog1/catalog2  |  accountID:catalog1:catalog2  |  123456789012/rmscatalog1:db1  |  GET /v1/catalogs/123456789012:rmscatalog1:db1/namespaces  | 

This catalog ID to prefix mapping is required only when you directly call the REST APIs. When you are working with the AWS Glue Iceberg REST catalog APIs through an engine, you need to specify the AWS Glue catalog ID in the `warehouse` parameter for your Iceberg REST catalog API setting or in the `glue.id` parameter for your AWS Glue extensions API setting. For example, see how you can use it with EMR Spark in [Use an Iceberg cluster with Spark](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-iceberg-use-spark-cluster.html).
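If you call the REST APIs directly, the mapping in the preceding table can be sketched as a small helper. The following Python example is illustrative only: the function names are hypothetical, and it assumes that nested catalog levels are joined with a colon in the REST path, as the example routes show.

```python
from typing import Optional


def catalog_prefix(catalog_id: Optional[str] = None) -> str:
    """Map a Glue catalog ID to the {catalog} path parameter.

    Assumption: nested levels are joined with ':' in the REST path, and ':'
    alone refers to the default catalog of the current account.
    """
    if not catalog_id:
        return ":"
    return catalog_id.replace("/", ":")


def list_namespaces_route(catalog_id: Optional[str] = None) -> str:
    """Build the ListNamespaces route for a given Glue catalog ID."""
    return f"GET /v1/catalogs/{catalog_prefix(catalog_id)}/namespaces"


print(list_namespaces_route())                                # GET /v1/catalogs/:/namespaces
print(list_namespaces_route("111122223333"))                  # GET /v1/catalogs/111122223333/namespaces
print(list_namespaces_route("rmscatalog1/db1"))               # GET /v1/catalogs/rmscatalog1:db1/namespaces
print(list_namespaces_route("123456789012:rmscatalog1/db1"))  # GET /v1/catalogs/123456789012:rmscatalog1:db1/namespaces
```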

## Namespace path parameter
<a name="ns-path-param"></a>

Namespaces in the Iceberg REST catalog API path can have multiple levels. However, AWS Glue supports only single-level namespaces. To access a namespace in a multi-level catalog hierarchy, connect to the multi-level catalog above the namespace and reference the namespace from there. This allows any query engine that supports the 3-part notation of `catalog.namespace.table` to access objects in the AWS Glue multi-level catalog hierarchy without the compatibility issues that multi-level namespaces can cause.
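As an illustration, a multi-level path can be split so that every level above the last one becomes part of the catalog ID and only the last level is used as the namespace. The following sketch assumes that catalog levels are joined with `/` in the Glue catalog ID; the function name and the `sales` namespace are hypothetical.

```python
from typing import List, Tuple


def split_catalog_and_namespace(levels: List[str]) -> Tuple[str, str]:
    """Treat the last level as the namespace and everything above it as the catalog."""
    if len(levels) < 2:
        raise ValueError("expected at least one catalog level and one namespace")
    *catalog_levels, namespace = levels
    return "/".join(catalog_levels), namespace


catalog_id, namespace = split_catalog_and_namespace(["rmscatalog1", "db1", "sales"])
print(catalog_id)  # rmscatalog1/db1
print(namespace)   # sales
```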

# Connecting to the Data Catalog using AWS Glue Iceberg REST extension endpoint
<a name="connect-glue-iceberg-rest-ext"></a>

The AWS Glue Iceberg REST extension endpoint provides additional APIs that are not present in the Apache Iceberg REST specification, along with server-side scan planning capabilities. These additional APIs are used when you access tables stored in Amazon Redshift managed storage. The endpoint is accessible from an application using Apache Iceberg AWS Glue Data Catalog extensions. 

**Endpoint configuration** – A catalog with tables in the Redshift managed storage is accessible using the service endpoint. Refer to the [AWS Glue service endpoints reference guide](https://docs.aws.amazon.com/general/latest/gr/glue.html#glue_region) for the region-specific endpoint. For example, when connecting to AWS Glue in the us-east-1 Region, you need to configure the endpoint URI property as follows:

```
Endpoint : https://glue.us-east-1.amazonaws.com/extensions
```

```
from pyspark.sql import SparkSession

# Replace these placeholder values with your own.
catalog_name = "myredshiftcatalog"
aws_account_id = "123456789012"
aws_region = "us-east-1"
spark = SparkSession.builder \
    .config("spark.sql.defaultCatalog", catalog_name) \
    .config(f"spark.sql.catalog.{catalog_name}", "org.apache.iceberg.spark.SparkCatalog") \
    .config(f"spark.sql.catalog.{catalog_name}.type", "glue") \
    .config(f"spark.sql.catalog.{catalog_name}.glue.id", f"{aws_account_id}:redshiftnamespacecatalog/redshiftdb") \
    .config("spark.sql.extensions","org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions") \
    .getOrCreate()
```

# AWS Glue REST APIs for Apache Iceberg specifications
<a name="iceberg-rest-apis"></a>

This section contains specifications about the AWS Glue Iceberg REST catalog and AWS Glue extension APIs, and considerations when using these APIs. 

API requests to the AWS Glue Data Catalog endpoints are authenticated using AWS Signature Version 4 (SigV4). For more information about SigV4, see [AWS Signature Version 4 for API requests](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_sigv.html).
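AWS SDKs and Iceberg clients with SigV4 enabled sign requests for you, so you rarely need to do this by hand. To illustrate one step of the process, the following sketch derives a SigV4 signing key using only the Python standard library. It does not build or sign a complete request, the secret key is a placeholder, and the `glue` service name is taken from the signing name used earlier in this guide.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str = "glue") -> bytes:
    """Derive the SigV4 signing key for one date (YYYYMMDD), Region, and service."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


# Placeholder secret key, not a real credential.
signing_key = derive_signing_key("EXAMPLE-SECRET-KEY", "20250101", "us-east-1")
print(signing_key.hex())
```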

When accessing the AWS Glue service endpoint and AWS Glue metadata, the application assumes an IAM role that requires the `glue:GetCatalog` IAM action. 

Access to the Data Catalog and its objects can be managed using IAM, Lake Formation, or Lake Formation hybrid mode permissions.
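As a starting point, an identity-based policy that allows this action might look like the following sketch, which builds the policy document in Python and prints it as JSON. The `"*"` resource is a placeholder assumption; in practice, you would typically scope the policy to specific catalog resources and add the IAM actions listed for each API operation that your application calls.

```python
import json

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["glue:GetCatalog"],
            # Assumption: "*" is a placeholder. Restrict it to specific ARNs in practice.
            "Resource": "*",
        }
    ],
}

print(json.dumps(policy, indent=2))
```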

Federated catalogs in the Data Catalog have Lake Formation registered data locations. Lake Formation works with the Data Catalog to provide database-style permissions to manage user access to Data Catalog objects. 

You can use IAM, AWS Lake Formation, or Lake Formation hybrid mode permissions to manage access to the default Data Catalog and its objects. 

To create, insert, or delete data in Lake Formation managed objects, you must set up specific permissions for the IAM user or role. 
+ CREATE_CATALOG – Required to create catalogs 
+ CREATE_DATABASE – Required to create databases
+ CREATE_TABLE – Required to create tables
+ DELETE – Required to delete data from a table
+ DESCRIBE – Required to read metadata 
+ DROP – Required to drop/delete a table or database
+ INSERT – Needed when the principal needs to insert data into a table
+ SELECT – Needed when the principal needs to select data from a table

For more information, see [Lake Formation permissions reference](https://docs.aws.amazon.com/lake-formation/latest/dg/lf-permissions-reference.html) in the AWS Lake Formation Developer Guide.

# GetConfig
<a name="get-config"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | GetConfig | 
| Type |  Iceberg REST Catalog API  | 
| REST path |  GET /iceberg/v1/config  | 
| IAM action |  glue:GetCatalog  | 
| Lake Formation permissions | Not applicable | 
| CloudTrail event |  glue:GetCatalog  | 
| Open API definition | https://github.com/apache/iceberg/blob/apache-iceberg-1.6.1/open-api/rest-catalog-open-api.yaml#L67 | 

**Considerations and limitations**
+ The `warehouse` query parameter must be set to the AWS Glue catalog ID. If not set, the root catalog in the current account is used to return the response. For more information, see [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters).
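The following sketch shows how the `warehouse` query parameter might be appended to the `GetConfig` URL. The helper name is hypothetical, and the resulting request still needs to be signed with SigV4 before it is sent.

```python
from typing import Optional
from urllib.parse import urlencode


def get_config_url(region: str, catalog_id: Optional[str] = None) -> str:
    """Build the GetConfig URL, adding the warehouse parameter when a catalog ID is given."""
    base = f"https://glue.{region}.amazonaws.com/iceberg/v1/config"
    if catalog_id is None:
        # Without the parameter, the root catalog in the current account is used.
        return base
    return f"{base}?{urlencode({'warehouse': catalog_id})}"


print(get_config_url("us-east-1"))
print(get_config_url("us-east-1", "123456789012"))
```

Because `urlencode` percent-encodes reserved characters, a nested catalog ID that contains `:` or `/` is encoded in the query string.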

# GetCatalog
<a name="get-catalog"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | GetCatalog | 
| Type |  AWS Glue extensions API  | 
| REST path |  GET /extensions/v1/catalogs/{catalog}  | 
| IAM action |  glue:GetCatalog  | 
| Lake Formation permissions | DESCRIBE | 
| CloudTrail event |  glue:GetCatalog  | 
| Open API definition | https://github.com/awslabs/glue-extensions-for-iceberg/blob/main/glue-extensions-api.yaml#L40 | 

**Considerations and limitations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.

# ListNamespaces
<a name="list-ns"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | ListNamespaces | 
| Type |  Iceberg REST Catalog API  | 
| REST path |  GET /iceberg/v1/catalogs/{catalog}/namespaces  | 
| IAM action |  glue:GetDatabase  | 
| Lake Formation permissions | ALL, DESCRIBE, SELECT | 
| CloudTrail event |  glue:GetDatabase  | 
| Open API definition | https://github.com/apache/iceberg/blob/apache-iceberg-1.6.1/open-api/rest-catalog-open-api.yaml#L205 | 

**Considerations and limitations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ Only namespaces of the next level are displayed. To list namespaces in deeper levels, specify the nested catalog ID in the catalog path parameter.

# CreateNamespace
<a name="create-ns"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | CreateNamespace | 
| Type |  Iceberg REST Catalog API  | 
| REST path |  POST /iceberg/v1/catalogs/{catalog}/namespaces  | 
| IAM action |  glue:CreateDatabase  | 
| Lake Formation permissions | ALL, DESCRIBE, SELECT | 
| CloudTrail event |  glue:CreateDatabase  | 
| Open API definition | https://github.com/apache/iceberg/blob/apache-iceberg-1.6.1/open-api/rest-catalog-open-api.yaml#L256 | 

**Considerations and limitations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can create only a single-level namespace. To create a multi-level namespace, you must iteratively create each level and connect to the level using the catalog path parameter.

# StartCreateNamespaceTransaction
<a name="start-create-ns-transaction"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | StartCreateNamespaceTransaction | 
| Type |  AWS Glue extensions API  | 
| REST path |  POST /extensions/v1/catalogs/{catalog}/namespaces  | 
| IAM action |  glue:CreateDatabase  | 
| Lake Formation permissions | ALL, DESCRIBE, SELECT | 
| CloudTrail event |  glue:CreateDatabase  | 
| Open API definition | https://github.com/apache/iceberg/blob/apache-iceberg-1.6.1/open-api/rest-catalog-open-api.yaml#L256 | 

**Considerations and limitations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can create only a single-level namespace. To create a multi-level namespace, you must iteratively create each level and connect to the level using the catalog path parameter.
+ The API is asynchronous and returns a transaction ID that you can use for tracking with the `CheckTransactionStatus` API call.
+ You can call this API only if the `GetCatalog` API response contains the parameter `use-extensions=true`. 
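Because the API is asynchronous, a caller typically polls until the transaction leaves its in-progress state. The following is a minimal polling sketch under stated assumptions: `check_status` is a hypothetical callable that wraps a `CheckTransactionStatus` call and returns a status string, and the `STARTED` and `FINISHED` values are placeholders rather than documented status names.

```python
import time
from typing import Callable


def wait_for_transaction(
    check_status: Callable[[str], str],
    transaction_id: str,
    poll_seconds: float = 1.0,
    max_attempts: int = 30,
) -> str:
    """Poll until the transaction is no longer in progress, then return its status.

    Assumption: "STARTED" is a hypothetical in-progress status value.
    """
    for _ in range(max_attempts):
        status = check_status(transaction_id)
        if status != "STARTED":
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"transaction {transaction_id} is still in progress")


# Stand-in for the real call, used only to exercise the loop.
_responses = iter(["STARTED", "STARTED", "FINISHED"])
print(wait_for_transaction(lambda _tx: next(_responses), "tx-123", poll_seconds=0))
```

The same pattern applies to the other asynchronous extensions APIs that return a transaction ID.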

# LoadNamespaceMetadata
<a name="load-ns-metadata"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | LoadNamespaceMetadata | 
| Type |  Iceberg REST Catalog API  | 
| REST path |  GET /iceberg/v1/catalogs/{catalog}/namespaces/{ns}  | 
| IAM action |  glue:GetDatabase  | 
| Lake Formation permissions | ALL, DESCRIBE, SELECT | 
| CloudTrail event |  glue:GetDatabase  | 
| Open API definition | https://github.com/apache/iceberg/blob/apache-iceberg-1.6.1/open-api/rest-catalog-open-api.yaml#L302 | 

**Considerations and limitations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can specify only a single-level namespace in the REST path parameter. For more information, see the [Namespace path parameter](connect-glu-iceberg-rest.md#ns-path-param) section.

# UpdateNamespaceProperties
<a name="w2aac20c29c16c21c13"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | UpdateNamespaceProperties | 
| Type |  Iceberg REST Catalog API  | 
| REST path |  POST /iceberg/v1/catalogs/{catalog}/namespaces/{ns}/properties  | 
| IAM action |  glue:UpdateDatabase  | 
| Lake Formation permissions | ALL, ALTER | 
| CloudTrail event |  glue:UpdateDatabase  | 
| Open API definition | https://github.com/apache/iceberg/blob/apache-iceberg-1.6.1/open-api/rest-catalog-open-api.yaml#L400 | 

**Considerations and limitations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can specify only a single-level namespace in the REST path parameter. For more information, see the [Namespace path parameter](connect-glu-iceberg-rest.md#ns-path-param) section.

# DeleteNamespace
<a name="delete-ns"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | DeleteNamespace | 
| Type |  Iceberg REST Catalog API  | 
| REST path |  DELETE /iceberg/v1/catalogs/{catalog}/namespaces/{ns}  | 
| IAM action |  glue:DeleteDatabase  | 
| Lake Formation permissions | ALL, DROP | 
| CloudTrail event |  glue:DeleteDatabase  | 
| Open API definition | https://github.com/apache/iceberg/blob/apache-iceberg-1.6.1/open-api/rest-catalog-open-api.yaml#L365 | 

**Considerations and limitations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can specify only a single-level namespace in the REST path parameter. For more information, see the [Namespace path parameter](connect-glu-iceberg-rest.md#ns-path-param) section.
+ If there are objects in the database, the operation fails.
+ The API is asynchronous and returns a transaction ID that you can use for tracking with the `CheckTransactionStatus` API call.
+ You can use this API only if the `GetCatalog` API response contains `use-extensions=true`. 

# StartDeleteNamespaceTransaction
<a name="start-delete-ns-transaction"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | StartDeleteNamespaceTransaction | 
| Type |  AWS Glue extensions API  | 
| REST path |  DELETE /extensions/v1/catalogs/{catalog}/namespaces/{ns}  | 
| IAM action |  glue:DeleteDatabase  | 
| Lake Formation permissions | ALL, DROP | 
| CloudTrail event |  glue:DeleteDatabase  | 
| Open API definition | https://github.com/awslabs/glue-extensions-for-iceberg/blob/main/glue-extensions-api.yaml#L85 | 

**Considerations and limitations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can specify only a single-level namespace in the REST path parameter. For more information, see the [Namespace path parameter](connect-glu-iceberg-rest.md#ns-path-param) section.
+ If there are objects in the database, the operation fails.
+ The API is asynchronous and returns a transaction ID that you can use for tracking with the `CheckTransactionStatus` API call.
+ You can use this API only if the `GetCatalog` API response contains `use-extensions=true`. 

# ListTables
<a name="list-tables"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | ListTables | 
| Type |  Iceberg REST Catalog API  | 
| REST path |  GET /iceberg/v1/catalogs/{catalog}/namespaces/{ns}/tables  | 
| IAM action |  glue:GetTables  | 
| Lake Formation permissions | ALL, SELECT, DESCRIBE | 
| CloudTrail event |  glue:GetTables  | 
| Open API definition | https://github.com/apache/iceberg/blob/apache-iceberg-1.6.1/open-api/rest-catalog-open-api.yaml#L463 | 

**Considerations and limitations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can specify only a single-level namespace in the REST path parameter. For more information, see the [Namespace path parameter](connect-glu-iceberg-rest.md#ns-path-param) section.
+ All tables, including non-Iceberg tables, are listed. To determine whether a table can be loaded as an Iceberg table, call the `LoadTable` operation.

# CreateTable
<a name="create-table"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | CreateTable | 
| Type |  Iceberg REST Catalog API  | 
| REST path |  POST /iceberg/v1/catalogs/{catalog}/namespaces/{ns}/tables  | 
| IAM action |  glue:CreateTable  | 
| Lake Formation permissions | ALL, CREATE_TABLE | 
| CloudTrail event |  glue:CreateTable  | 
| Open API definition | https://github.com/apache/iceberg/blob/apache-iceberg-1.6.1/open-api/rest-catalog-open-api.yaml#L497 | 

**Considerations and limitations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can specify only a single-level namespace in the REST path parameter. For more information, see the [Namespace path parameter](connect-glu-iceberg-rest.md#ns-path-param) section.
+ `CreateTable` with staging is not supported. If the `stageCreate` query parameter is specified, the operation fails. This means that operations such as `CREATE TABLE AS SELECT` are not supported. As a workaround, you can use a combination of `CREATE TABLE` and `INSERT INTO`.
+ The `CreateTable` API operation doesn't support the option `stage-create = TRUE`.
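The `CREATE TABLE` and `INSERT INTO` workaround can be sketched as a helper that produces the two statements in place of a single `CREATE TABLE AS SELECT`. The function name, table names, and columns in the following example are hypothetical.

```python
from typing import List


def ctas_workaround(target: str, column_ddl: str, select_sql: str) -> List[str]:
    """Return a CREATE TABLE and an INSERT INTO statement that stand in for CTAS."""
    return [
        f"CREATE TABLE {target} ({column_ddl}) USING iceberg",
        f"INSERT INTO {target} {select_sql}",
    ]


statements = ctas_workaround(
    "mydatacatalog.db1.orders_copy",   # hypothetical target table
    "order_id BIGINT, amount DOUBLE",  # hypothetical columns
    "SELECT order_id, amount FROM mydatacatalog.db1.orders",
)
for statement in statements:
    print(statement)
```

In a Spark application, each statement could then be run with `spark.sql(statement)`.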

# StartCreateTableTransaction
<a name="start-create-table-transaction"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | StartCreateTableTransaction | 
| Type |  AWS Glue extensions API  | 
| REST path |  POST /extensions/v1/catalogs/{catalog}/namespaces/{ns}/tables  | 
| IAM action |  glue:CreateTable  | 
| Lake Formation permissions | ALL, CREATE_TABLE | 
| CloudTrail event |  glue:CreateTable  | 
| Open API definition | https://github.com/awslabs/glue-extensions-for-iceberg/blob/main/glue-extensions-api.yaml#L107 | 

**Considerations and limitations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can specify only a single-level namespace in the REST path parameter. For more information, see the [Namespace path parameter](connect-glu-iceberg-rest.md#ns-path-param) section.
+ `CreateTable` with staging is not supported. If the `stageCreate` query parameter is specified, the operation fails. This means that operations such as `CREATE TABLE AS SELECT` are not supported. As a workaround, you can use a combination of `CREATE TABLE` and `INSERT INTO`.
+ The API is asynchronous and returns a transaction ID that you can use for tracking with the `CheckTransactionStatus` API call.
+ You can use this API only if the `GetCatalog` API response contains `use-extensions=true`. 

# LoadTable
<a name="load-table"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | LoadTable | 
| Type |  Iceberg REST Catalog API  | 
| REST path |  GET /iceberg/v1/catalogs/{catalog}/namespaces/{ns}/tables/{table}  | 
| IAM action |  glue:GetTable  | 
| Lake Formation permissions | ALL, SELECT, DESCRIBE | 
| CloudTrail event |  glue:GetTable  | 
| Open API definition | https://github.com/apache/iceberg/blob/apache-iceberg-1.6.1/open-api/rest-catalog-open-api.yaml#L616 | 

**Considerations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can specify only a single-level namespace in the REST path parameter. For more information, see the [Namespace path parameter](connect-glu-iceberg-rest.md#ns-path-param) section.
+ `CreateTable` with staging is not supported. If the `stageCreate` query parameter is specified, the operation fails. This means that operations such as `CREATE TABLE AS SELECT` are not supported. As a workaround, you can use a combination of `CREATE TABLE` and `INSERT INTO`.
+ The API is asynchronous and returns a transaction ID that you can use for tracking with the `CheckTransactionStatus` API call.
+ You can use this API only if the `GetCatalog` API response contains `use-extensions=true`. 

# ExtendedLoadTable
<a name="extended-load-table"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | ExtendedLoadTable | 
| Type |  AWS Glue extensions API  | 
| REST path |  GET /extensions/v1/catalogs/{catalog}/namespaces/{ns}/tables/{table}  | 
| IAM action |  glue:GetTable  | 
| Lake Formation permissions | ALL, SELECT, DESCRIBE | 
| CloudTrail event |  glue:GetTable  | 
| Open API definition | https://github.com/awslabs/glue-extensions-for-iceberg/blob/main/glue-extensions-api.yaml#L134 | 

**Considerations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can specify only a single-level namespace in the REST path parameter. For more information, see the [Namespace path parameter](connect-glu-iceberg-rest.md#ns-path-param) section.
+ Only the `all` mode is supported for the `snapshots` query parameter.
+ Compared to the `LoadTable` API, the `ExtendedLoadTable` API differs in the following ways:
  + It doesn't strictly require all the fields to be available.
  + It provides the following additional parameters in the config field of the response:   
**Additional parameters**    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/glue/latest/dg/extended-load-table.html)

# PreplanTable
<a name="preplan-table"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | PreplanTable | 
| Type |  AWS Glue extensions API  | 
| REST path |  POST /extensions/v1/catalogs/{catalog}/namespaces/{ns}/tables/{table}/preplan  | 
| IAM action |  glue:GetTable  | 
| Lake Formation permissions | ALL, SELECT, DESCRIBE | 
| CloudTrail event |  glue:GetTable  | 
| Open API definition | https://github.com/awslabs/glue-extensions-for-iceberg/blob/main/glue-extensions-api.yaml#L211 | 

**Considerations**
+ The catalog path parameter should follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can specify only a single-level namespace in the REST path parameter. For more information, see the [Namespace path parameter](connect-glu-iceberg-rest.md#ns-path-param) section.
+ The caller of this API must always use the page token to determine whether there are remaining results to fetch. A response with an empty page but a pagination token is possible if the server is still processing and cannot produce any results within the given response time.
+ You can use this API only if the `ExtendedLoadTable` API response contains `aws.server-side-capabilities.scan-planning=true`. 

# PlanTable
<a name="plan-table"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | PlanTable | 
| Type |  AWS Glue extensions API  | 
| REST path |  POST /extensions/v1/catalogs/{catalog}/namespaces/{ns}/tables/{table}/plan  | 
| IAM action |  glue:GetTable  | 
| Lake Formation permissions | ALL, SELECT, DESCRIBE | 
| CloudTrail event |  glue:GetTable  | 
| Open API definition | https://github.com/awslabs/glue-extensions-for-iceberg/blob/main/glue-extensions-api.yaml#L243 | 

**Considerations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can specify only a single-level namespace in the REST path parameter. For more information, see the [Namespace path parameter](connect-glu-iceberg-rest.md#ns-path-param) section.
+ The caller of this API must always use the page token to determine whether there are remaining results to fetch. A response with an empty page but a pagination token is possible if the server is still processing and cannot produce any results within the given response time.
+ You can use this API only if the `ExtendedLoadTable` API response contains `aws.server-side-capabilities.scan-planning=true`. 

# TableExists
<a name="table-exists"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | TableExists | 
| Type |  Iceberg REST Catalog API  | 
| REST path |  HEAD /iceberg/v1/catalogs/{catalog}/namespaces/{ns}/tables/{table}  | 
| IAM action |  glue:GetTable  | 
| Lake Formation permissions | ALL, SELECT, DESCRIBE | 
| CloudTrail event |  glue:GetTable  | 
| Open API definition | https://github.com/apache/iceberg/blob/apache-iceberg-1.6.1/open-api/rest-catalog-open-api.yaml#L833 | 

**Considerations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can specify only a single-level namespace in the REST path parameter. For more information, see the [Namespace path parameter](connect-glu-iceberg-rest.md#ns-path-param) section.

# UpdateTable
<a name="update-table"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | UpdateTable | 
| Type |  Iceberg REST Catalog API  | 
| REST path |  POST /iceberg/v1/catalogs/{catalog}/namespaces/{ns}/tables/{table}  | 
| IAM action |  glue:UpdateTable  | 
| Lake Formation permissions | ALL, ALTER | 
| CloudTrail event |  glue:UpdateTable  | 
| Open API definition | https://github.com/apache/iceberg/blob/apache-iceberg-1.6.1/open-api/rest-catalog-open-api.yaml#L677 | 

**Considerations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can specify only a single-level namespace in the REST path parameter. For more information, see the [Namespace path parameter](connect-glu-iceberg-rest.md#ns-path-param) section.

# StartUpdateTableTransaction
<a name="start-update-table-transaction"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | StartUpdateTableTransaction | 
| Type | AWS Glue extensions API | 
| REST path |  POST /extensions/v1/catalogs/{catalog}/namespaces/{ns}/tables/{table}  | 
| IAM action |  glue:UpdateTable  | 
| Lake Formation permissions |  ALL, ALTER  | 
| CloudTrail event |  glue:UpdateTable  | 
| Open API definition | https://github.com/awslabs/glue-extensions-for-iceberg/blob/main/glue-extensions-api.yaml#L154 | 

**Considerations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can specify only a single-level namespace in the REST path parameter. For more information, see the [Namespace path parameter](connect-glu-iceberg-rest.md#ns-path-param) section.
+ The API is asynchronous and returns a transaction ID that you can use for tracking with the `CheckTransactionStatus` API call.
+ A `RenameTable` operation can also be performed through this API. In that case, the caller must also have the `glue:CreateTable` IAM permission or the Lake Formation CREATE_TABLE permission for the new table name. 
+ You can use this API only if the `ExtendedLoadTable` API response contains `aws.server-side-capabilities.scan-planning=true`. 

# DeleteTable
<a name="delete-table"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | DeleteTable | 
| Type |  Iceberg REST Catalog API  | 
| REST path |  DELETE /iceberg/v1/catalogs/{catalog}/namespaces/{ns}/tables/{table}  | 
| IAM action |  glue:DeleteTable  | 
| Lake Formation permissions | ALL, DROP | 
| CloudTrail event |  glue:DeleteTable  | 
| Open API definition | https://github.com/apache/iceberg/blob/apache-iceberg-1.6.1/open-api/rest-catalog-open-api.yaml#L793 | 

**Considerations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can specify only a single-level namespace in the REST Path parameter. For more information, see the [Namespace path parameter](connect-glu-iceberg-rest.md#ns-path-param) section.
+ The `DeleteTable` API operation supports a purge option. When purge is set to `true`, the table data is deleted; otherwise, the data is not deleted. For tables in Amazon S3, the operation does not delete table data. The operation fails when the table is stored in Amazon S3 and `purge = TRUE`. 

  For tables stored in Amazon Redshift managed storage, the operation deletes table data, similar to `DROP TABLE` behavior in Amazon Redshift. The operation fails when the table is stored in Amazon Redshift and `purge = FALSE`.
+ `purgeRequest=true` is not supported. 

# StartDeleteTableTransaction
<a name="start-delete-table-transaction"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | StartDeleteTableTransaction | 
| Type |  AWS Glue extensions API  | 
| REST path |  DELETE /extensions/v1/catalogs/{catalog}/namespaces/{ns}/tables/{table}  | 
| IAM action |  glue:DeleteTable  | 
| Lake Formation permissions | ALL, DROP | 
| CloudTrail event |  glue:DeleteTable  | 
| Open API definition | https://github.com/apache/iceberg/blob/apache-iceberg-1.6.1/open-api/rest-catalog-open-api.yaml#L793 | 

**Considerations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.
+ You can specify only a single-level namespace in the REST Path parameter. For more information, see the [Namespace path parameter](connect-glu-iceberg-rest.md#ns-path-param) section.
+ `purgeRequest=false` is not supported. 
+  The API is asynchronous, and returns a transaction ID that can be tracked through `CheckTransactionStatus`. 

# CheckTransactionStatus
<a name="check-transaction-status"></a>


**General information**  

|  |  | 
| --- |--- |
| Operation name | CheckTransactionStatus | 
| Type |  AWS Glue extensions API  | 
| REST path |  POST /extensions/v1/transactions/status  | 
| IAM action |  The same permission as the action that initiates the transaction  | 
| Lake Formation permissions | The same permission as the action that initiates the transaction | 
| Open API definition | https://github.com/awslabs/glue-extensions-for-iceberg/blob/main/glue-extensions-api.yaml#L273 | 

**Considerations**
+ The catalog path parameter must follow the style described in the [Prefix and catalog path parameters](connect-glu-iceberg-rest.md#prefix-catalog-path-parameters) section.

# Connecting to Data Catalog from a standalone Spark application
<a name="connect-gludc-spark"></a>

You can connect to the Data Catalog from a standalone Spark application using an Apache Iceberg connector. 

1. Create an IAM role for the Spark application.

1. Connect to the AWS Glue Iceberg REST endpoint using the Iceberg connector.

   ```
   # Configure your application. For best practices on configuring environment variables, see
   # https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html
   # Read the temporary credentials stored in the appUser profile.
   export AWS_ACCESS_KEY_ID=$(aws configure get appUser.aws_access_key_id)
   export AWS_SECRET_ACCESS_KEY=$(aws configure get appUser.aws_secret_access_key)
   export AWS_SESSION_TOKEN=$(aws configure get appUser.aws_session_token)
   
   export AWS_REGION=us-east-1
   export REGION=us-east-1
   # Replace the placeholder with your AWS account ID. Do not put spaces around "=".
   export AWS_ACCOUNT_ID="{specify your aws account id here}"
   
   # The Iceberg runtime artifact must match the Spark version (3.5 here).
   ~/spark-3.5.3-bin-hadoop3/bin/spark-shell \
       --packages org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.6.0 \
       --conf "spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions" \
       --conf "spark.sql.defaultCatalog=spark_catalog" \
       --conf "spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkCatalog" \
       --conf "spark.sql.catalog.spark_catalog.type=rest" \
       --conf "spark.sql.catalog.spark_catalog.uri=https://glue.us-east-1.amazonaws.com/iceberg" \
       --conf "spark.sql.catalog.spark_catalog.warehouse=${AWS_ACCOUNT_ID}" \
       --conf "spark.sql.catalog.spark_catalog.rest.sigv4-enabled=true" \
       --conf "spark.sql.catalog.spark_catalog.rest.signing-name=glue" \
       --conf "spark.sql.catalog.spark_catalog.rest.signing-region=us-east-1" \
       --conf "spark.sql.catalog.spark_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO" \
       --conf "spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"
   ```

1. Query data in the Data Catalog.

   ```
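   // Create a database and an Iceberg table, then insert a row.
   spark.sql("create database myicebergdb").show()
   spark.sql("""CREATE TABLE myicebergdb.mytbl (name string) USING iceberg location 's3://bucket_name/mytbl'""")
   spark.sql("insert into myicebergdb.mytbl values('demo') ").show()
   // Read the row back to confirm that the table is queryable.
   spark.sql("select * from myicebergdb.mytbl").show()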
   ```
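
When the same settings are needed for several Regions or accounts, it can help to generate the `--conf` arguments instead of editing them by hand. The following Python sketch rebuilds the catalog settings from step 2. The function name `iceberg_rest_conf` and its parameters are hypothetical, and the catalog name defaults to `spark_catalog` as in the example above.

```python
from typing import List


def iceberg_rest_conf(region: str, account_id: str, catalog: str = "spark_catalog") -> List[str]:
    """Build spark-shell --conf arguments for the AWS Glue Iceberg REST endpoint."""
    prefix = f"spark.sql.catalog.{catalog}"
    settings = {
        "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        "spark.sql.defaultCatalog": catalog,
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        # The endpoint and the SigV4 signing Region must refer to the same Region.
        f"{prefix}.uri": f"https://glue.{region}.amazonaws.com/iceberg",
        # The warehouse value is the AWS account ID, as in the shell example.
        f"{prefix}.warehouse": account_id,
        f"{prefix}.rest.sigv4-enabled": "true",
        f"{prefix}.rest.signing-name": "glue",
        f"{prefix}.rest.signing-region": region,
        f"{prefix}.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
    }
    args: List[str] = []
    for key, value in settings.items():
        args += ["--conf", f"{key}={value}"]
    return args
```

The returned list can be passed to `subprocess.run` together with the `spark-shell` path and the `--packages` argument. Deriving both the URI and the signing Region from one `region` parameter avoids the mismatch that can occur when they are edited separately.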

# Data mapping between Amazon Redshift and Apache Iceberg
<a name="data-mapping-rs-iceberg"></a>

Amazon Redshift and Apache Iceberg each support their own set of data types. The following compatibility matrix outlines the support and limitations when mapping data between the two systems. For more details on the data types supported in each system, see [Amazon Redshift Data Types](https://docs.aws.amazon.com/redshift/latest/dg/c_Supported_data_types.html) and [Apache Iceberg Table Specifications](https://iceberg.apache.org/spec/#primitive-types).


| Redshift data type | Aliases | Iceberg data type | 
| --- | --- | --- | 
| SMALLINT | INT2 | int | 
| INTEGER | INT, INT4 | int | 
| BIGINT | INT8 | long | 
| DECIMAL | NUMERIC | decimal | 
| REAL | FLOAT4 | float | 
| DOUBLE PRECISION | FLOAT8, FLOAT | double | 
| CHAR | CHARACTER, NCHAR | string | 
| VARCHAR | CHARACTER VARYING, NVARCHAR | string | 
| BPCHAR |  | string | 
| TEXT |  | string | 
| DATE |  | date | 
| TIME | TIME WITHOUT TIME ZONE | time | 
| TIMETZ | TIME WITH TIME ZONE | Not supported | 
| TIMESTAMP | TIMESTAMP WITHOUT TIME ZONE | timestamp | 
| TIMESTAMPTZ | TIMESTAMP WITH TIME ZONE | timestamptz | 
| INTERVAL YEAR TO MONTH |  | Not supported | 
| INTERVAL DAY TO SECOND |  | Not supported | 
| BOOLEAN | BOOL | boolean | 
| HLLSKETCH |  | Not supported | 
| SUPER |  | Not supported | 
| VARBYTE | VARBINARY, BINARY VARYING | binary | 
| GEOMETRY |  | Not supported | 
| GEOGRAPHY |  | Not supported | 
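
The mapping in the table can also be expressed as a lookup, which is convenient when validating a Redshift schema before exposing it through Iceberg. The following Python sketch covers the base types and the short aliases from the table. The names `REDSHIFT_TO_ICEBERG`, `ALIASES`, and `iceberg_type_for` are hypothetical, Iceberg types are written with the lowercase names used in the Iceberg specification, and a value of `None` marks a type that the table lists as not supported.

```python
from typing import Optional

# Redshift base type -> Iceberg primitive type (None means not supported).
REDSHIFT_TO_ICEBERG = {
    "SMALLINT": "int", "INTEGER": "int", "BIGINT": "long",
    "DECIMAL": "decimal", "REAL": "float", "DOUBLE PRECISION": "double",
    "CHAR": "string", "VARCHAR": "string", "BPCHAR": "string", "TEXT": "string",
    "DATE": "date", "TIME": "time", "TIMETZ": None,
    "TIMESTAMP": "timestamp", "TIMESTAMPTZ": "timestamptz",
    "INTERVAL YEAR TO MONTH": None, "INTERVAL DAY TO SECOND": None,
    "BOOLEAN": "boolean", "HLLSKETCH": None, "SUPER": None,
    "VARBYTE": "binary", "GEOMETRY": None, "GEOGRAPHY": None,
}

# Redshift alias -> Redshift base type.
ALIASES = {
    "INT2": "SMALLINT", "INT": "INTEGER", "INT4": "INTEGER", "INT8": "BIGINT",
    "NUMERIC": "DECIMAL", "FLOAT4": "REAL",
    "FLOAT8": "DOUBLE PRECISION", "FLOAT": "DOUBLE PRECISION",
    "CHARACTER": "CHAR", "NCHAR": "CHAR",
    "CHARACTER VARYING": "VARCHAR", "NVARCHAR": "VARCHAR",
    "BOOL": "BOOLEAN", "VARBINARY": "VARBYTE", "BINARY VARYING": "VARBYTE",
}


def iceberg_type_for(redshift_type: str) -> Optional[str]:
    """Return the Iceberg type for a Redshift type name, or None if unsupported."""
    # Normalize case and collapse repeated whitespace before the lookup.
    name = " ".join(redshift_type.upper().split())
    name = ALIASES.get(name, name)
    if name not in REDSHIFT_TO_ICEBERG:
        raise KeyError(f"unknown Redshift type: {redshift_type}")
    return REDSHIFT_TO_ICEBERG[name]
```

For example, `iceberg_type_for("int2")` resolves the alias to `SMALLINT` and returns `int`, while `iceberg_type_for("SUPER")` returns `None`.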

# Considerations and limitations when using AWS Glue Iceberg REST Catalog APIs
<a name="limitation-glue-iceberg-rest-api"></a>

The following considerations and limitations apply to Data Definition Language (DDL) operations performed through the AWS Glue Iceberg REST Catalog APIs.

**Considerations**
+  **`RenameTable` API behavior** – The `RenameTable` operation is supported for tables in Amazon Redshift but not for tables in Amazon S3. 
+  **DDL operations for namespaces and tables in Amazon Redshift** – Create, update, and delete operations for namespaces and tables in Amazon Redshift are asynchronous. They depend on when the Amazon Redshift managed workgroup is available and on whether a conflicting DDL or DML transaction is in progress, in which case the operation must wait for the lock before it attempts to commit changes. 

**Limitations**
+  View APIs in the Apache Iceberg REST specification are not supported in the AWS Glue Iceberg REST Catalog.