

# Amazon Athena Redis OSS connector
<a name="connectors-redis"></a>

The Amazon Athena Redis OSS connector enables Amazon Athena to communicate with your Redis OSS instances so that you can query your Redis OSS data with SQL. You can use the AWS Glue Data Catalog to map your Redis OSS key-value pairs into virtual tables.

Unlike traditional relational data stores, Redis OSS does not have the concept of a table or a column. Instead, Redis OSS offers key-value access patterns where the key is essentially a `string` and the value is a `string`, `z-set`, or `hmap`.

You can use the AWS Glue Data Catalog to create schema and configure virtual tables. Special table properties tell the Athena Redis OSS connector how to map your Redis OSS keys and values into a table. For more information, see [Setting up databases and tables in AWS Glue](#connectors-redis-setting-up-databases-and-tables-in-glue) later in this document.

This connector does not use Glue Connections to centralize configuration properties in Glue. Connection configuration is done through Lambda.

If you have Lake Formation enabled in your account, the IAM role for your Athena federated Lambda connector that you deployed in the AWS Serverless Application Repository must have read access in Lake Formation to the AWS Glue Data Catalog.

The Amazon Athena Redis OSS connector supports Amazon MemoryDB and Amazon ElastiCache (Redis OSS).

## Prerequisites
<a name="connectors-redis-prerequisites"></a>
+ Deploy the connector to your AWS account using the Athena console or the AWS Serverless Application Repository. For more information, see [Create a data source connection](connect-to-a-data-source.md) or [Use the AWS Serverless Application Repository to deploy a data source connector](connect-data-source-serverless-app-repo.md).
+ Set up a VPC and a security group before you use this connector. For more information, see [Create a VPC for a data source connector or AWS Glue connection](athena-connectors-vpc-creation.md).

## Parameters
<a name="connectors-redis-parameters"></a>

Use the parameters in this section to configure the Redis connector.
+ **spill\$1bucket** – Specifies the Amazon S3 bucket for data that exceeds Lambda function limits.
+ **spill\$1prefix** – (Optional) Defaults to a subfolder in the specified `spill_bucket` called `athena-federation-spill`. We recommend that you configure an Amazon S3 [storage lifecycle](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lifecycle-mgmt.html) on this location to delete spills older than a predetermined number of days or hours.
+ **spill\$1put\$1request\$1headers** – (Optional) A JSON encoded map of request headers and values for the Amazon S3 `putObject` request that is used for spilling (for example, `{"x-amz-server-side-encryption" : "AES256"}`). For other possible headers, see [PutObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html) in the *Amazon Simple Storage Service API Reference*.
+ **kms\$1key\$1id** – (Optional) By default, any data that is spilled to Amazon S3 is encrypted using the AES-GCM authenticated encryption mode and a randomly generated key. To have your Lambda function use stronger encryption keys generated by KMS like `a7e63k4b-8loc-40db-a2a1-4d0en2cd8331`, you can specify a KMS key ID.
+ **disable\$1spill\$1encryption** – (Optional) When set to `True`, disables spill encryption. Defaults to `False` so that data that is spilled to S3 is encrypted using AES-GCM – either using a randomly generated key or KMS to generate keys. Disabling spill encryption can improve performance, especially if your spill location uses [server-side encryption](https://docs.aws.amazon.com/AmazonS3/latest/userguide/serv-side-encryption.html).
+ **glue\$1catalog** – (Optional) Use this option to specify a [cross-account AWS Glue catalog](data-sources-glue-cross-account.md). By default, the connector attempts to get metadata from its own AWS Glue account.

## Setting up databases and tables in AWS Glue
<a name="connectors-redis-setting-up-databases-and-tables-in-glue"></a>

To enable an AWS Glue table for use with Redis OSS, you can set the following table properties on the table: `redis-endpoint`, `redis-value-type`, and either `redis-keys-zset` or `redis-key-prefix`.

In addition, any AWS Glue database that contains Redis OSS tables must have a `redis-db-flag` in the URI property of the database. To set the `redis-db-flag` URI property, use the AWS Glue console to edit the database.

The following list describes the table properties.
+ **redis-endpoint** – (Required) The *hostname*`:`*port*`:`*password* of the Redis OSS server that contains data for this table (for example, `athena-federation-demo.cache.amazonaws.com:6379`) Alternatively, you can store the endpoint, or part of the endpoint, in AWS Secrets Manager by using \$1\$1*Secret\$1Name*\$1 as the table property value.

**Note**  
To use the Athena Federated Query feature with AWS Secrets Manager, the VPC connected to your Lambda function should have [internet access](https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/) or a [VPC endpoint](https://docs.aws.amazon.com/secretsmanager/latest/userguide/vpc-endpoint-overview.html) to connect to Secrets Manager.
+ **redis-keys-zset** – (Required if `redis-key-prefix` is not used) A comma-separated list of keys whose value is a [zset](https://redis.com/ebook/part-2-core-concepts/chapter-3-commands-in-redis/3-5-sorted-sets/) (for example, `active-orders,pending-orders`). Each of the values in the zset is treated as a key that is part of the table. Either the `redis-keys-zset` property or the `redis-key-prefix` property must be set.
+ **redis-key-prefix** – (Required if `redis-keys-zset` is not used) A comma separated list of key prefixes to scan for values in the table (for example, `accounts-*,acct-`). Either the `redis-key-prefix` property or the `redis-keys-zset` property must be set.
+ **redis-value-type** – (Required) Defines how the values for the keys defined by either `redis-key-prefix` or `redis-keys-zset` are mapped to your table. A literal maps to a single column. A zset also maps to a single column, but each key can store many rows. A hash enables each key to be a row with multiple columns (for example, a hash, literal, or zset.)
+ **redis-ssl-flag** – (Optional) When `True`, creates a Redis connection that uses SSL/TLS. The default is `False`.
+ **redis-cluster-flag** – (Optional) When `True`, enables support for clustered Redis instances. The default is `False`.
+ **redis-db-number** – (Optional) Applies only to standalone, non-clustered instances.) Set this number (for example 1, 2, or 3) to read from a non-default Redis database. The default is Redis logical database 0. This number does not refer to a database in Athena or AWS Glue, but to a Redis logical database. For more information, see [SELECT index](https://redis.io/commands/select) in the Redis documentation.

## Data types
<a name="connectors-redis-data-types"></a>

The Redis OSS connector supports the following data types. Redis OSS streams are not supported.
+ [String](https://redis.com/ebook/part-1-getting-started/chapter-1-getting-to-know-redis/1-2-what-redis-data-structures-look-like/1-2-1-strings-in-redis/)
+ [Hash](https://redis.com/ebook/part-1-getting-started/chapter-1-getting-to-know-redis/1-2-what-redis-data-structures-look-like/1-2-4-hashes-in-redis/)
+ Sorted Set ([ZSet](https://redis.com/ebook/part-2-core-concepts/chapter-3-commands-in-redis/3-5-sorted-sets/))

All Redis OSS values are retrieved as the `string` data type. Then they are converted to one of the following Apache Arrow data types based on how your tables are defined in the AWS Glue Data Catalog.


****  

| AWS Glue data type | Apache Arrow data type | 
| --- | --- | 
| int | INT | 
| string | VARCHAR | 
| bigint | BIGINT | 
| double | FLOAT8 | 
| float | FLOAT4 | 
| smallint | SMALLINT | 
| tinyint | TINYINT | 
| boolean | BIT | 
| binary | VARBINARY | 

## Required Permissions
<a name="connectors-redis-required-permissions"></a>

For full details on the IAM policies that this connector requires, review the `Policies` section of the [athena-redis.yaml](https://github.com/awslabs/aws-athena-query-federation/blob/master/athena-redis/athena-redis.yaml) file. The following list summarizes the required permissions.
+ **Amazon S3 write access** – The connector requires write access to a location in Amazon S3 in order to spill results from large queries.
+ **Athena GetQueryExecution** – The connector uses this permission to fast-fail when the upstream Athena query has terminated.
+ **AWS Glue Data Catalog** – The Redis connector requires read only access to the AWS Glue Data Catalog to obtain schema information.
+ **CloudWatch Logs** – The connector requires access to CloudWatch Logs for storing logs.
+ **AWS Secrets Manager read access** – If you choose to store Redis endpoint details in Secrets Manager, you must grant the connector access to those secrets.
+ **VPC access** – The connector requires the ability to attach and detach interfaces to your VPC so that it can connect to it and communicate with your Redis instances.

## Performance
<a name="connectors-redis-performance"></a>

The Athena Redis OSS connector attempts to parallelize queries against your Redis OSS instance according to the type of table that you have defined (for example, zset keys or prefix keys).

The Athena Redis connector performs predicate pushdown to decrease the data scanned by the query. However, queries containing a predicate against the primary key fail with timeout. `LIMIT` clauses reduce the amount of data scanned, but if you don't provide a predicate, you should expect `SELECT` queries with a `LIMIT` clause to scan at least 16 MB of data. The Redis connector is resilient to throttling due to concurrency.

## Passthrough queries
<a name="connectors-redis-passthrough-queries"></a>

The Redis connector supports [passthrough queries](federated-query-passthrough.md). You can use this feature to run queries that use Lua script on Redis databases. 

To create passthrough queries with Redis, use the following syntax:

```
SELECT * FROM TABLE(
        system.script(
            script => 'return redis.[call|pcall](query_script)',
            keys => '[key_pattern]',
            argv => '[script_arguments]'
))
```

The following example runs a Lua script to get the value at key `l:a`.

```
SELECT * FROM TABLE(
        system.script(
            script => 'return redis.call("GET", KEYS[1])',
            keys => '[l:a]',
            argv => '[]'
))
```

## License information
<a name="connectors-redis-license-information"></a>

The Amazon Athena Redis connector project is licensed under the [Apache-2.0 License](https://www.apache.org/licenses/LICENSE-2.0.html).

## Additional resources
<a name="connectors-redis-additional-resources"></a>

For additional information about this connector, visit [the corresponding site](https://github.com/awslabs/aws-athena-query-federation/tree/master/athena-redis) on GitHub.com.