

# Security for Apache Livy with Amazon EMR on EKS
<a name="job-runs-apache-livy-security"></a>

See the following topics to learn how to configure security for Apache Livy with Amazon EMR on EKS. The options include transport-layer security (TLS), role-based access control (RBAC), which grants access based on a person's role within an organization, and IAM roles, which provide access to AWS resources based on granted permissions.

**Topics**
+ [Setting up a secure Apache Livy endpoint with TLS/SSL](job-runs-apache-livy-secure-endpoint.md)
+ [Setting up the Apache Livy and Spark application permissions with role-based access control (RBAC)](job-runs-apache-livy-rbac.md)
+ [Setting up access permissions with IAM roles for service accounts (IRSA)](job-runs-apache-livy-irsa.md)

# Setting up a secure Apache Livy endpoint with TLS/SSL
<a name="job-runs-apache-livy-secure-endpoint"></a>

See the following sections to learn more about setting up Apache Livy for Amazon EMR on EKS with end-to-end TLS and SSL encryption.

## Setting up TLS and SSL encryption
<a name="job-runs-apache-livy-security-tls"></a>

To set up SSL encryption on your Apache Livy endpoint, follow these steps.
+ [Install the Secrets Store CSI Driver and AWS Secrets and Configuration Provider (ASCP)](https://docs.aws.amazon.com/secretsmanager/latest/userguide/integrating_csi_driver.html) – the Secrets Store CSI Driver and ASCP securely store Livy's JKS certificates and passwords that the Livy server pod needs to enable SSL. You can also install just the Secrets Store CSI Driver and use any other supported secrets provider.
+ [Create an ACM certificate](https://docs.aws.amazon.com/acm/latest/userguide/gs-acm-request-public.html) – this certificate is required to secure the connection between the client and the ALB endpoint.
+ Set up a JKS certificate, key password, and keystore password for AWS Secrets Manager – required to secure the connection between the ALB endpoint and the Livy server.
+ Add permissions to the Livy service account to retrieve secrets from AWS Secrets Manager – the Livy server needs these permissions to retrieve secrets from ASCP and add the Livy configurations to secure the Livy server. To add IAM permissions to a service account, see Setting up access permissions with IAM roles for service accounts (IRSA).

### Setting up a JKS certificate with a key and a keystore password for AWS Secrets Manager
<a name="job-runs-apache-livy-jks-certificate"></a>

Follow these steps to set up a JKS certificate with a key and a keystore password.

1. Generate a keystore file for the Livy server.

   ```
   keytool -genkey -alias <host> -keyalg RSA -keysize 2048 -dname "CN=<host>,OU=hw,O=hw,L=<your_location>,ST=<state>,C=<country>" -keypass <keyPassword> -keystore <keystore_file> -storepass <storePassword> -validity 3650
   ```

1. Create a certificate.

   ```
   keytool -export -alias <host> -keystore mykeystore.jks -rfc -file mycertificate.cert -storepass <storePassword>
   ```

1. Create a truststore file.

   ```
   keytool -import -noprompt -alias <host> -file <cert_file> -keystore <truststore_file> -storepass <truststorePassword>
   ```

1. Save the JKS certificate in AWS Secrets Manager. Replace `livy-jks-secret` with your secret and `fileb://mykeystore.jks` with the path to your keystore JKS certificate.

   ```
   aws secretsmanager create-secret \
   --name livy-jks-secret \
   --description "My Livy keystore JKS secret" \
   --secret-binary fileb://mykeystore.jks
   ```

1. Save the keystore and key password in Secrets Manager. Make sure to use your own parameters.

   ```
   aws secretsmanager create-secret \
   --name livy-passwords \
   --description "My Livy key and keystore password secret" \
   --secret-string "{\"keyPassword\":\"<test-key-password>\",\"keyStorePassword\":\"<test-key-store-password>\"}"
   ```

1. Create a Livy server namespace with the following command.

   ```
   kubectl create ns <livy-ns>
   ```

1. Create the `SecretProviderClass` object for the Livy server that has the JKS certificate and the passwords.

   ```
   cat >livy-secret-provider-class.yaml << EOF
   apiVersion: secrets-store.csi.x-k8s.io/v1
   kind: SecretProviderClass
   metadata:
     name: aws-secrets
   spec:
     provider: aws
     parameters:
       objects: |
           - objectName: "livy-jks-secret"
             objectType: "secretsmanager"
           - objectName: "livy-passwords"
             objectType: "secretsmanager"
                        
   EOF
   kubectl apply -f livy-secret-provider-class.yaml -n <livy-ns>
   ```
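
The passwords secret is a small JSON document, and hand-escaping it on the command line is easy to get wrong. The following is a minimal sketch, not part of the official procedure, that assembles the JSON in a helper. The function name `build_password_secret` is hypothetical, and the sketch assumes that the passwords contain no double quotes or backslashes.

```shell
# Hypothetical helper: builds the JSON document for the passwords secret.
# Assumption: the passwords contain no double quotes or backslashes,
# because this sketch does not escape them.
build_password_secret() {
  printf '{"keyPassword":"%s","keyStorePassword":"%s"}\n' "$1" "$2"
}

# Example values only; use your own key and keystore passwords.
secret_string=$(build_password_secret 'example-key-pass' 'example-store-pass')
echo "$secret_string"
```

You could then pass the result to `aws secretsmanager create-secret` with `--secret-string "$secret_string"`, using the secret name that the `SecretProviderClass` references (`livy-passwords` in this example).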

## Getting started with SSL-enabled Apache Livy
<a name="job-runs-apache-livy-ssl-enabled-getting-started"></a>

After enabling SSL on your Livy server, you must set up the `serviceAccount` to have access to the `keyStore` and `keyPasswords` secrets on AWS Secrets Manager.

1. Create the Livy server namespace.

   ```
   kubectl create namespace <livy-ns>
   ```

1. Set up the Livy service account to have access to the secrets in Secrets Manager. For more information about setting up IRSA, see [Setting up IRSA while installing Apache Livy](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/job-runs-apache-livy-irsa.html#job-runs-apache-livy-irsa). Then use the following command to authenticate Helm to the Amazon ECR registry.

   ```
   aws ecr get-login-password --region region-id | helm registry login \
   --username AWS \
   --password-stdin ECR-registry-account.dkr.ecr.region-id.amazonaws.com
   ```

1. Install Livy. For the Helm chart `--version` parameter, use your Amazon EMR release label, such as `7.1.0`. You must also replace the Amazon ECR registry account ID and Region ID with your own IDs. You can find the corresponding `ECR-registry-account` value for your AWS Region from [Amazon ECR registry accounts by Region](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/docker-custom-images-tag.html#docker-custom-images-ECR).

   ```
   helm install <livy-app-name> \
     oci://895885662937.dkr.ecr.region-id.amazonaws.com/livy \
     --version 7.12.0 \
     --namespace livy-namespace-name \
     --set image=ECR-registry-account.dkr.ecr.region-id.amazonaws.com/livy/emr-7.12.0:latest \
     --set sparkNamespace=spark-namespace \
     --set ssl.enabled=true \
     --set ssl.CertificateArn=livy-acm-certificate-arn \
     --set ssl.secretProviderClassName=aws-secrets \
     --set ssl.keyStoreObjectName=livy-jks-secret \
     --set ssl.keyPasswordsObjectName=livy-passwords \
     --create-namespace
   ```

1. Continue from step 5 of [Installing Apache Livy on Amazon EMR on EKS](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/job-runs-apache-livy-setup.html#job-runs-apache-livy-install).
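
The registry host name appears in the Helm login, the chart location, and the `image` value, so it can help to build it once from the account ID and Region. The following is a minimal sketch under that assumption. The helper names `ecr_registry_host` and `livy_image` are hypothetical and are not part of any AWS tooling.

```shell
# Hypothetical helper: composes the Amazon ECR registry host name
# from a registry account ID ($1) and a Region ID ($2).
ecr_registry_host() {
  printf '%s.dkr.ecr.%s.amazonaws.com\n' "$1" "$2"
}

# Hypothetical helper: composes the Livy image URI for a release label ($3),
# following the image path pattern used in the install command above.
livy_image() {
  printf '%s/livy/emr-%s:latest\n' "$(ecr_registry_host "$1" "$2")" "$3"
}

# Example values only; replace them with the registry account for your
# Region and your own release label.
registry=$(ecr_registry_host 895885662937 us-west-2)
image=$(livy_image 895885662937 us-west-2 7.12.0)
echo "oci://$registry/livy"
echo "$image"
```

The two printed values correspond to the chart location and the `--set image=` value in the `helm install` command.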

# Setting up the Apache Livy and Spark application permissions with role-based access control (RBAC)
<a name="job-runs-apache-livy-rbac"></a>

To deploy Livy, Amazon EMR on EKS creates a server service account and role and a Spark service account and role. These roles must have the necessary RBAC permissions to finish setup and run Spark applications.

**RBAC permissions for the server service account and role**

Amazon EMR on EKS creates the Livy server service account and role to manage Livy sessions for Spark jobs and to route traffic to and from the ingress and other resources.

The default name for this service account is `emr-containers-sa-livy`. It must have the following permissions.

```
rules:
- apiGroups:
  - ""
  resources:
  - "namespaces"
  verbs:
  - "get"
- apiGroups:
  - ""
  resources:
  - "serviceaccounts"
  - "services"
  - "configmaps"
  - "events"
  - "pods"
  - "pods/log"
  verbs:
  - "get"
  - "list"
  - "watch"
  - "describe"
  - "create"
  - "edit"
  - "delete"
  - "deletecollection"
  - "annotate"
  - "patch"
  - "label"
- apiGroups:
  - ""
  resources:
  - "secrets"
  verbs:
  - "create"
  - "patch"
  - "delete"
  - "watch"
- apiGroups:
  - ""
  resources:
  - "persistentvolumeclaims"
  verbs:
  - "get"
  - "list"
  - "watch"
  - "describe"
  - "create"
  - "edit"
  - "delete"
  - "annotate"
  - "patch"
  - "label"
```
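
One way to confirm that the rules took effect is to ask the Kubernetes API server whether the service account can perform a given action. The following is a minimal sketch that only prints `kubectl auth can-i` commands for a sample of the verbs and resources listed above, so that you can review them before you run them. The function name `print_rbac_checks` and the namespace `livy-ns` are assumptions for illustration.

```shell
# Hypothetical helper: prints one `kubectl auth can-i` command per
# verb/resource pair for the service account $2 in namespace $1.
# It prints the commands only; it does not contact the cluster.
print_rbac_checks() {
  rbac_ns=$1
  rbac_sa=$2
  for rbac_resource in pods services configmaps; do
    for rbac_verb in get list watch create delete; do
      printf 'kubectl auth can-i %s %s --namespace %s --as system:serviceaccount:%s:%s\n' \
        "$rbac_verb" "$rbac_resource" "$rbac_ns" "$rbac_ns" "$rbac_sa"
    done
  done
}

# Example: checks for the default Livy server service account.
print_rbac_checks livy-ns emr-containers-sa-livy
```

Each printed command answers `yes` or `no` when you run it against your cluster. Running commands with `--as` requires that your own user has permission to impersonate the service account.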

**RBAC permissions for the Spark service account and role**

A Spark driver pod needs a Kubernetes service account in the same namespace as the pod. This service account needs permissions to manage executor pods and any resources required by the driver pod. Unless the default service account in the namespace has the required permissions, the driver fails and exits. The following RBAC permissions are required.

```
rules:
- apiGroups:
  - ""
  - "batch"
  - "extensions"
  - "apps"
  resources:
  - "configmaps"
  - "serviceaccounts"
  - "events"
  - "pods"
  - "pods/exec"
  - "pods/log"
  - "pods/portforward"
  - "secrets"
  - "services"
  - "persistentvolumeclaims"
  - "statefulsets"
  verbs:
  - "create"
  - "delete"
  - "get"
  - "list"
  - "patch"
  - "update"
  - "watch"
  - "describe"
  - "edit"
  - "deletecollection"
  - "label"
```

# Setting up access permissions with IAM roles for service accounts (IRSA)
<a name="job-runs-apache-livy-irsa"></a>

By default, the Livy server and the Spark application's driver and executors don't have access to AWS resources. The server service account and the Spark service account control access to AWS resources for the Livy server and the Spark application's pods. To grant access, map the service accounts to an IAM role that has the necessary AWS permissions.

You can set up IRSA mapping before you install Apache Livy, during the installation, or after you finish the installation.

## Setting up IRSA while installing Apache Livy (for server service account)
<a name="job-runs-apache-livy-irsa"></a>

**Note**  
This mapping is supported only for the server service account.

1. Make sure that you have finished [setting up Apache Livy for Amazon EMR on EKS](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/job-runs-apache-livy-setup.html) and are in the middle of [installing Apache Livy with Amazon EMR on EKS](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/job-runs-apache-livy-install.html). 

1. Create a Kubernetes namespace for the Livy server. In this example, the name of the namespace is `livy-ns`.

1. Create an IAM policy that includes the permissions for the AWS services that you want your pods to access. The following example creates an IAM policy that allows getting Amazon S3 objects for the Spark entry point.

   ```
   cat >my-policy.json <<EOF
   {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Effect": "Allow",
               "Action": "s3:GetObject",
               "Resource": "arn:aws:s3:::my-spark-entrypoint-bucket/*"
           }
       ]
   }
   EOF
   
   aws iam create-policy --policy-name my-policy --policy-document file://my-policy.json
   ```

1. Use the following command to set your AWS account ID to a variable.

   ```
   account_id=$(aws sts get-caller-identity --query "Account" --output text)
   ```

1. Set the OpenID Connect (OIDC) identity provider of your cluster to an environment variable.

   ```
   oidc_provider=$(aws eks describe-cluster --name my-cluster --region $AWS_REGION --query "cluster.identity.oidc.issuer" --output text | sed -e "s/^https:\/\///")
   ```

1. Set variables for the namespace and name of the service account. Be sure to use your own values.

   ```
   export namespace=default
   export service_account=my-service-account
   ```

1. Create a trust policy file with the following command. To grant the role to all service accounts within a namespace, copy the following command, replace `StringEquals` with `StringLike`, and replace `$service_account` with `*`.

   ```
   cat >trust-relationship.json <<EOF
   {
     "Version": "2012-10-17",
     "Statement": [
       {
         "Effect": "Allow",
         "Principal": {
           "Federated": "arn:aws:iam::$account_id:oidc-provider/$oidc_provider"
         },
         "Action": "sts:AssumeRoleWithWebIdentity",
         "Condition": {
           "StringEquals": {
             "$oidc_provider:aud": "sts.amazonaws.com",
             "$oidc_provider:sub": "system:serviceaccount:$namespace:$service_account"
           }
         }
       }
     ]
   }
   EOF
   ```

1. Create the role.

   ```
   aws iam create-role --role-name my-role --assume-role-policy-document file://trust-relationship.json --description "my-role-description"
   ```

1. Use the following Helm install command to set `serviceAccount.executionRoleArn`, which maps the IAM role to the server service account. You can find the corresponding `ECR-registry-account` value for your AWS Region from [Amazon ECR registry accounts by Region](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/docker-custom-images-tag.html#docker-custom-images-ECR).

   ```
   helm install livy-demo \
     oci://895885662937.dkr.ecr.us-west-2.amazonaws.com/livy \
     --version 7.12.0 \
     --namespace livy-ns \
     --set image=ECR-registry-account.dkr.ecr.region-id.amazonaws.com/livy/emr-7.12.0:latest \
     --set sparkNamespace=spark-ns \
     --set serviceAccount.executionRoleArn=arn:aws:iam::123456789012:role/my-role
   ```
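
The steps above create `my-policy` and `my-role` but don't show the policy being attached to the role. Assuming that the role is meant to carry that policy, the following sketch assembles the policy ARN and prints an `aws iam attach-role-policy` command for review. The helper names are hypothetical, and the sketch assumes the standard `aws` partition and a customer managed policy in your own account.

```shell
# Hypothetical helper: builds the ARN of a customer managed IAM policy
# from an account ID ($1) and a policy name ($2).
# Assumption: the standard "aws" partition.
iam_policy_arn() {
  printf 'arn:aws:iam::%s:policy/%s\n' "$1" "$2"
}

# Hypothetical helper: prints the attach command for account $1, role $2,
# and policy $3 instead of running it, so that you can review it first.
print_attach_command() {
  printf 'aws iam attach-role-policy --role-name %s --policy-arn %s\n' \
    "$2" "$(iam_policy_arn "$1" "$3")"
}

# Example values only; in the steps above the account ID is in $account_id.
print_attach_command 123456789012 my-role my-policy
```

Run the printed command after you create the role and before you rely on the role's permissions.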

## Mapping IRSA to a Spark service account
<a name="job-runs-apache-livy-irsa-spark"></a>

Before you map IRSA to a Spark service account, make sure that you meet the following prerequisites:
+ You have finished [setting up Apache Livy for Amazon EMR on EKS](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/job-runs-apache-livy-setup.html) and are in the middle of [installing Apache Livy with Amazon EMR on EKS](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/job-runs-apache-livy-install.html).
+ You have an existing IAM OpenID Connect (OIDC) provider for your cluster. To see if you already have one or how to create one, see [Create an IAM OIDC provider for your cluster](https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html).
+ You have installed version 0.171.0 or later of the `eksctl` CLI, or you are using AWS CloudShell. To install or update `eksctl`, see [Installation](https://eksctl.io/installation/) in the `eksctl` documentation.

Follow these steps to map IRSA to your Spark service account:

1. Use the following command to get the Spark service account.

   ```
   SPARK_NAMESPACE=<spark-ns>
   LIVY_APP_NAME=<livy-app-name>
   kubectl --namespace $SPARK_NAMESPACE describe sa -l "app.kubernetes.io/instance=$LIVY_APP_NAME" | awk '/^Name:/ {print $2}'
   ```

1. Set your variables for the namespace and name of the service account.

   ```
   export namespace=default
   export service_account=my-service-account
   ```

1. Use the following command to create a trust policy file for the IAM role. To give all service accounts within the namespace permission to use the role, replace `StringEquals` with `StringLike` and replace `$service_account` with `*`.

   ```
   cat >trust-relationship.json <<EOF
   {
     "Version": "2012-10-17",
     "Statement": [
       {
         "Effect": "Allow",
         "Principal": {
           "Federated": "arn:aws:iam::$account_id:oidc-provider/$oidc_provider"
         },
         "Action": "sts:AssumeRoleWithWebIdentity",
         "Condition": {
           "StringEquals": {
             "$oidc_provider:aud": "sts.amazonaws.com",
             "$oidc_provider:sub": "system:serviceaccount:$namespace:$service_account"
           }
         }
       }
     ]
   }
   EOF
   ```

1. Create the role.

   ```
   aws iam create-role --role-name my-role --assume-role-policy-document file://trust-relationship.json --description "my-role-description"
   ```

1. Map the server or Spark service account to the role with the following `eksctl` command. Make sure to use your own values.

   ```
   eksctl create iamserviceaccount --name spark-sa \
     --namespace spark-namespace --cluster livy-eks-cluster \
     --attach-role-arn arn:aws:iam::123456789012:role/my-role \
     --approve --override-existing-serviceaccounts
   ```
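
The trust policy differs between the single-account form and the namespace-wide form only in the condition operator and the `sub` value. The following is a minimal sketch of that switch, with hypothetical helper names (`oidc_provider_from_issuer`, `render_trust_policy`) and placeholder values. It prints the policy to standard output and calls no AWS APIs.

```shell
# Strips the https:// scheme from an OIDC issuer URL, mirroring the
# sed call used in the steps above.
oidc_provider_from_issuer() {
  printf '%s\n' "$1" | sed -e 's/^https:\/\///'
}

# Hypothetical helper: prints a trust policy for one service account, or
# for every service account in the namespace when the fourth argument is '*'.
# Arguments: account ID, OIDC provider, namespace, service account.
render_trust_policy() {
  tp_account_id=$1
  tp_oidc_provider=$2
  tp_namespace=$3
  tp_service_account=$4
  if [ "$tp_service_account" = "*" ]; then
    tp_operator=StringLike      # wildcard match on the sub claim
  else
    tp_operator=StringEquals    # exact match on the sub claim
  fi
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${tp_account_id}:oidc-provider/${tp_oidc_provider}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "${tp_operator}": {
          "${tp_oidc_provider}:aud": "sts.amazonaws.com",
          "${tp_oidc_provider}:sub": "system:serviceaccount:${tp_namespace}:${tp_service_account}"
        }
      }
    }
  ]
}
EOF
}

# Example values only; the issuer URL and account ID are placeholders.
provider=$(oidc_provider_from_issuer "https://oidc.eks.us-west-2.amazonaws.com/id/EXAMPLE")
render_trust_policy 123456789012 "$provider" spark-ns '*'
```

To use the result with `aws iam create-role`, redirect the output to `trust-relationship.json`. Passing a specific service account name instead of `'*'` yields the `StringEquals` form shown in the steps.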