

# Configure and launch an EMR cluster with LDAP
<a name="ldap-setup"></a>

This section covers how to configure Amazon EMR for use with LDAP authentication.

**Topics**
+ [Add AWS Secrets Manager permissions to the Amazon EMR instance role](ldap-setup-asm.md)
+ [Create the Amazon EMR security configuration for LDAP integration](ldap-setup-security.md)
+ [Launch an EMR cluster that authenticates with LDAP](ldap-setup-launch.md)

# Add AWS Secrets Manager permissions to the Amazon EMR instance role
<a name="ldap-setup-asm"></a>

Amazon EMR uses an IAM service role to perform actions on your behalf to provision and manage clusters. The service role for cluster EC2 instances, also called *the EC2 instance profile for Amazon EMR*, is a special type of service role that Amazon EMR assigns to every EC2 instance in a cluster at launch.

To define permissions for an EMR cluster to interact with Amazon S3 data and other AWS services, define a custom Amazon EC2 instance profile instead of the `EMR_EC2_DefaultRole` when you launch your cluster. For more information, see [Service role for cluster EC2 instances (EC2 instance profile)](emr-iam-role-for-ec2.md) and [Customize IAM roles with Amazon EMR](emr-iam-roles-custom.md).

Add the following statements to the default EC2 instance profile to allow Amazon EMR to tag sessions and to access the AWS Secrets Manager secrets that store LDAP certificates.

```
{
  "Sid": "AllowAssumeOfRolesAndTagging",
  "Effect": "Allow",
  "Action": ["sts:TagSession", "sts:AssumeRole"],
  "Resource": [
    "arn:aws:iam::111122223333:role/LDAP_DATA_ACCESS_ROLE_NAME",
    "arn:aws:iam::111122223333:role/LDAP_USER_ACCESS_ROLE_NAME"
  ]
},
{
  "Sid": "AllowSecretsRetrieval",
  "Effect": "Allow",
  "Action": "secretsmanager:GetSecretValue",
  "Resource": [
    "arn:aws:secretsmanager:us-east-1:111122223333:secret:LDAP_SECRET_NAME*",
    "arn:aws:secretsmanager:us-east-1:111122223333:secret:ADMIN_LDAP_SECRET_NAME*"
  ]
}
```

**Note**  
Your cluster requests will fail if you forget the wildcard `*` character at the end of the secret name when you set Secrets Manager permissions. The wildcard represents the secret versions.  
You should also limit the scope of the AWS Secrets Manager policy to only the certificates that your cluster needs to provision instances.
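
As an illustration, the following Python sketch generates the two statements for a given account and makes sure that every secret ARN ends with the `*` wildcard. The helper name `build_ldap_instance_profile_statements` and its parameters are hypothetical and are not part of any AWS SDK. The account ID, Region, role names, and secret names are the placeholders from the preceding example.

```python
import json


def build_ldap_instance_profile_statements(account_id, region, role_names, secret_names):
    """Return the two policy statements as Python dicts.

    Hypothetical helper for illustration only; not part of any AWS SDK.
    """
    role_arns = [f"arn:aws:iam::{account_id}:role/{name}" for name in role_names]
    # Strip any wildcard the caller supplied, then append exactly one "*".
    secret_arns = [
        f"arn:aws:secretsmanager:{region}:{account_id}:secret:{name.rstrip('*')}*"
        for name in secret_names
    ]
    return [
        {
            "Sid": "AllowAssumeOfRolesAndTagging",
            "Effect": "Allow",
            "Action": ["sts:TagSession", "sts:AssumeRole"],
            "Resource": role_arns,
        },
        {
            "Sid": "AllowSecretsRetrieval",
            "Effect": "Allow",
            "Action": "secretsmanager:GetSecretValue",
            "Resource": secret_arns,
        },
    ]


statements = build_ldap_instance_profile_statements(
    account_id="111122223333",
    region="us-east-1",
    role_names=["LDAP_DATA_ACCESS_ROLE_NAME", "LDAP_USER_ACCESS_ROLE_NAME"],
    # The second name already carries a wildcard; the helper does not double it.
    secret_names=["LDAP_SECRET_NAME", "ADMIN_LDAP_SECRET_NAME*"],
)
print(json.dumps(statements, indent=2))
```

Listing each secret by name, rather than using a broad pattern such as `secret:*`, keeps the policy limited to the certificates that the cluster needs.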

# Create the Amazon EMR security configuration for LDAP integration
<a name="ldap-setup-security"></a>

Before you can launch an EMR cluster with LDAP integration, use the steps in [Create a security configuration with the Amazon EMR console or with the AWS CLI](emr-create-security-configuration.md) to create an Amazon EMR security configuration for the cluster. Complete the following configurations in the `LDAPConfiguration` block under `AuthenticationConfiguration`, or in the corresponding fields in the Amazon EMR console **Security Configurations** section:

**`EnableLDAPAuthentication`**  
Console option: **Authentication protocol: LDAP**  
To use the LDAP integration, set this option to `true` or select it as your authentication protocol when you create a cluster in the console. By default, `EnableLDAPAuthentication` is `true` when you create a security configuration in the Amazon EMR console.

**`LDAPServerURL`**  
Console option: **LDAP server location**  
The location of the LDAP server including the prefix: `ldaps://location_of_server`.

**`BindCertificateARN`**  
Console option: **LDAP SSL certificate**  
The ARN of the AWS Secrets Manager secret that contains the certificate used to sign the SSL certificate that the LDAP server uses. If your LDAP server's certificate is signed by a public Certificate Authority (CA), you can provide an AWS Secrets Manager ARN with a blank file. For more information on how to store your certificate in Secrets Manager, see [Store TLS certificates in AWS Secrets Manager](emr-ranger-tls-certificates.md).

**`BindCredentialsARN`**  
Console option: **LDAP server bind credentials**  
An AWS Secrets Manager ARN that contains the LDAP admin user bind credentials. The credentials are stored as a JSON object. There is only one key-value pair in this secret; the key in the pair is the username, and the value is the password. For example, `{"uid=admin,cn=People,dc=example,dc=com": "AdminPassword1"}`. This is an optional field unless you enable SSH login for your EMR cluster. In many configurations, Active Directory instances require bind credentials to allow SSSD to sync users.

**`LDAPAccessFilter`**  
Console option: **LDAP access filter**  
Specifies the subset of objects within your LDAP server that can authenticate. For example, if you want to grant access to all users with the `posixAccount` object class in your LDAP server, define the access filter as `(objectClass=posixAccount)`.

**`LDAPUserSearchBase`**  
Console option: **LDAP user search base**  
The search base that your users belong under within your LDAP server. For example, `cn=People,dc=example,dc=com`.

**`LDAPGroupSearchBase`**  
Console option: **LDAP group search base**  
The search base that your groups belong under within your LDAP server. For example, `cn=Groups,dc=example,dc=com`.

**`EnableSSHLogin`**  
Console option: **SSH login**  
Specifies whether or not to allow password authentication with LDAP credentials. We don't recommend that you enable this option. Key pairs are a more secure route to allow access into EMR clusters. This field is optional and defaults to `false`. 

**`LDAPServerType`**  
Console option: **LDAP server type**  
Specifies the type of LDAP server that Amazon EMR connects to. Supported options are Active Directory and OpenLDAP. Other LDAP server types might work, but Amazon EMR doesn't officially support other server types. For more information, see [LDAP components for Amazon EMR](ldap-components.md).

**`ActiveDirectoryConfigurations`**  
A required sub-block for security configurations that use the Active Directory server type.

**`ADDomain`**  
Console option: **Active Directory domain**  
The domain name used to create the User Principal Name (UPN) for user authentication with security configurations that use the Active Directory server type.
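
To show how these fields might fit together, the following Python sketch assembles an `LDAPConfiguration` block under `AuthenticationConfiguration` and prints it as JSON. It rests on several assumptions: the helper names are hypothetical, the ARNs and server location are placeholders, the `LDAPServerType` string `OpenLDAP` is assumed to be the accepted spelling, and `ADDomain` is assumed to nest inside `ActiveDirectoryConfigurations`. The sketch omits the in-transit encryption settings that a complete security configuration also requires.

```python
import json


def build_ldap_authentication_configuration(
    server_url,
    bind_certificate_arn,
    access_filter,
    user_search_base,
    group_search_base,
    server_type,
    bind_credentials_arn=None,
    enable_ssh_login=False,
    ad_domain=None,
):
    """Assemble the AuthenticationConfiguration fragment (hypothetical helper)."""
    ldap = {
        "EnableLDAPAuthentication": True,
        "LDAPServerURL": server_url,
        "BindCertificateARN": bind_certificate_arn,
        "LDAPAccessFilter": access_filter,
        "LDAPUserSearchBase": user_search_base,
        "LDAPGroupSearchBase": group_search_base,
        "EnableSSHLogin": enable_ssh_login,
        "LDAPServerType": server_type,
    }
    # Bind credentials are optional unless SSH login is enabled.
    if bind_credentials_arn is not None:
        ldap["BindCredentialsARN"] = bind_credentials_arn
    # Assumed nesting: ADDomain lives inside ActiveDirectoryConfigurations.
    if ad_domain is not None:
        ldap["ActiveDirectoryConfigurations"] = {"ADDomain": ad_domain}
    return {"AuthenticationConfiguration": {"LDAPConfiguration": ldap}}


def bind_credentials_secret_string(bind_dn, password):
    """Return the single key-value JSON object stored in the bind credentials secret."""
    return json.dumps({bind_dn: password})


ldap_config = build_ldap_authentication_configuration(
    server_url="ldaps://location_of_server",
    # Placeholder ARN; replace with the ARN of your own secret.
    bind_certificate_arn="arn:aws:secretsmanager:us-east-1:111122223333:secret:LDAP_SECRET_NAME",
    access_filter="(objectClass=posixAccount)",
    user_search_base="cn=People,dc=example,dc=com",
    group_search_base="cn=Groups,dc=example,dc=com",
    server_type="OpenLDAP",  # assumed spelling of the server type value
)
secret_string = bind_credentials_secret_string(
    "uid=admin,cn=People,dc=example,dc=com", "AdminPassword1"
)
print(json.dumps(ldap_config, indent=2))
```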

## Considerations for security configurations with LDAP and Amazon EMR
<a name="ldap-setup-security-considerations"></a>
+ To create a security configuration with Amazon EMR LDAP integration, you must use in-transit encryption. For information about in-transit encryption, see [Encrypt data at rest and in transit with Amazon EMR](emr-data-encryption.md).
+ You can't define Kerberos configuration in the same security configuration. Amazon EMR automatically provisions a dedicated KDC and manages the admin password for this KDC. Users can't access this admin password.
+ You can't define IAM runtime roles and AWS Lake Formation in the same security configuration.
+ The `LDAPServerURL` must have the `ldaps://` protocol in its value.
+ The `LDAPAccessFilter` can't be empty. 
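
Some of these considerations can be checked locally before you create the security configuration. The following Python sketch is one possible pre-flight check, not an official validator. The function name `check_ldap_security_configuration` is hypothetical, and the locations of `KerberosConfiguration` (under `AuthenticationConfiguration`) and `EnableInTransitEncryption` (under `EncryptionConfiguration`) are assumptions about the security configuration JSON layout. The sketch doesn't cover the runtime role and Lake Formation restriction.

```python
def check_ldap_security_configuration(config):
    """Return a list of problems; an empty list means none of these checks failed.

    Hypothetical helper that covers only a subset of the considerations.
    """
    problems = []
    auth = config.get("AuthenticationConfiguration", {})
    ldap = auth.get("LDAPConfiguration", {})
    encryption = config.get("EncryptionConfiguration", {})

    # Assumed key name and location for the in-transit encryption switch.
    if not encryption.get("EnableInTransitEncryption", False):
        problems.append("In-transit encryption must be enabled.")
    # Assumed location of a Kerberos block in the same configuration.
    if "KerberosConfiguration" in auth:
        problems.append("Kerberos can't be defined in the same security configuration.")
    if not str(ldap.get("LDAPServerURL", "")).startswith("ldaps://"):
        problems.append("LDAPServerURL must use the ldaps:// protocol.")
    if not str(ldap.get("LDAPAccessFilter", "")).strip():
        problems.append("LDAPAccessFilter can't be empty.")
    return problems


# A configuration that breaks three of the rules above.
bad_config = {
    "AuthenticationConfiguration": {
        "LDAPConfiguration": {
            "LDAPServerURL": "ldap://location_of_server",
            "LDAPAccessFilter": "",
        }
    }
}
for problem in check_ldap_security_configuration(bad_config):
    print(problem)
```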

## Use LDAP with the Apache Ranger integration for Amazon EMR
<a name="ldap-setup-ranger"></a>

With the LDAP integration for Amazon EMR, you can further integrate with Apache Ranger. When you pull your LDAP users into Ranger, you can then associate those users with an Apache Ranger policy server to integrate with Amazon EMR and other applications. To do this, define the `RangerConfiguration` field within `AuthorizationConfiguration` in the security configuration that you use with your LDAP cluster. For more information on how to set up the security configuration, see [Create the EMR security configuration](emr-ranger-security-config.md).

When you use LDAP with Amazon EMR, you don't need to provide a `KerberosConfiguration` with the Amazon EMR integration for Apache Ranger. 

# Launch an EMR cluster that authenticates with LDAP
<a name="ldap-setup-launch"></a>

Use the following steps to launch an EMR cluster with LDAP or Active Directory. 

1. Set up your environment:
   + Make sure that the nodes on your EMR cluster can communicate with Amazon S3 and AWS Secrets Manager. For more information on how to modify your EC2 instance profile role to communicate with these services, see [Add AWS Secrets Manager permissions to the Amazon EMR instance role](ldap-setup-asm.md).
   + If you plan to run your EMR cluster in a private subnet, you should use AWS PrivateLink and Amazon VPC endpoints, or use network address translation (NAT) to configure the VPC to communicate with S3 and Secrets Manager. For more information, see [AWS PrivateLink and VPC endpoints](https://docs.aws.amazon.com/vpc/latest/userguide/endpoint-services-overview.html) and [NAT instances](https://docs.aws.amazon.com/vpc/latest/userguide/VPC_NAT_Instance.html) in the *Amazon VPC Getting Started Guide*.
   + Make sure that your EMR cluster can reach the LDAP server over the network. The primary, core, and task nodes for the cluster communicate with the LDAP server to sync user data. If your LDAP server runs on Amazon EC2, update the EC2 security group to accept traffic from the EMR cluster. For more information, see [Add AWS Secrets Manager permissions to the Amazon EMR instance role](ldap-setup-asm.md).

1. Create an Amazon EMR security configuration for the LDAP integration. For more information, see [Create the Amazon EMR security configuration for LDAP integration](ldap-setup-security.md).

1. Now that you're set up, use the steps in [Launch an Amazon EMR cluster](emr-gs.md#emr-getting-started-launch-sample-cluster) to launch your cluster with the following configurations:
   + Select Amazon EMR release 6.12 or higher. We recommend that you use the latest Amazon EMR release.
   + Only specify or select applications for your cluster that support LDAP. For a list of LDAP-supported applications with Amazon EMR, see [Application support and considerations with LDAP for Amazon EMR](ldap-considerations.md).
   + Apply the security configuration that you created in the previous step.
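
If you launch clusters programmatically, the launch settings above could be assembled along the following lines. This Python sketch only builds the arguments and doesn't call AWS. The helper names, the cluster name, the instance types and count, the role names, the subnet ID, and the application list are all illustrative assumptions. Confirm that each application that you list supports LDAP before you use it.

```python
import re


def release_supports_ldap(release_label):
    """Return True if a label such as 'emr-6.12.0' is release 6.12 or higher."""
    match = re.fullmatch(r"emr-(\d+)\.(\d+)\.(\d+)", release_label)
    if match is None:
        raise ValueError(f"Unrecognized release label: {release_label}")
    major, minor, _patch = (int(part) for part in match.groups())
    return (major, minor) >= (6, 12)


def build_run_job_flow_arguments(
    name, release_label, applications, security_configuration_name,
    instance_profile, service_role, subnet_id,
):
    """Build keyword arguments for an EMR RunJobFlow request (hypothetical helper)."""
    if not release_supports_ldap(release_label):
        raise ValueError("LDAP integration requires Amazon EMR release 6.12 or higher.")
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        "Applications": [{"Name": app} for app in applications],
        # The security configuration created in the previous step.
        "SecurityConfiguration": security_configuration_name,
        # The EC2 instance profile that carries the Secrets Manager permissions.
        "JobFlowRole": instance_profile,
        "ServiceRole": service_role,
        "Instances": {
            "MasterInstanceType": "m5.xlarge",  # placeholder instance type
            "SlaveInstanceType": "m5.xlarge",   # placeholder instance type
            "InstanceCount": 3,                 # placeholder cluster size
            "KeepJobFlowAliveWhenNoSteps": True,
            "Ec2SubnetId": subnet_id,
        },
    }


arguments = build_run_job_flow_arguments(
    name="ldap-cluster",
    release_label="emr-6.12.0",
    applications=["Spark"],  # placeholder; list only LDAP-supported applications
    security_configuration_name="my-ldap-security-configuration",
    instance_profile="my-custom-emr-ec2-role",
    service_role="EMR_DefaultRole",
    subnet_id="subnet-EXAMPLE",
)
print(arguments["ReleaseLabel"], arguments["SecurityConfiguration"])
```

With the AWS SDK for Python (Boto3) installed and credentials configured, you could pass the result to `boto3.client("emr").run_job_flow(**arguments)`.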