

# Development endpoints

**Note**  
 **The console experience for dev endpoints has been removed as of March 31, 2023.** Creating, updating, and monitoring dev endpoints is still available via the [Development endpoints API](aws-glue-api-dev-endpoint.md) and the [AWS Glue CLI](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/glue/index.html#cli-aws-glue).

 We strongly recommend migrating from dev endpoints to interactive sessions for the reasons listed below. For required actions on how to migrate from dev endpoints to interactive sessions, see [Migrating from dev endpoints to interactive sessions](https://docs.aws.amazon.com/glue/latest/dg/development-migration-checklist.html). 


| Description | Dev endpoints | Interactive sessions | 
| --- | --- | --- | 
| Glue version support | Supports AWS Glue version 0.9 and 1.0 | Supports AWS Glue version 2.0 and later | 
| Region availability | Dev endpoints are not available in the Asia Pacific (Jakarta) (ap-southeast-3), Middle East (UAE) (me-central-1), Europe (Spain) (eu-south-2), Europe (Zurich) (eu-central-2), or other new Regions going forward | Interactive sessions are not currently available in the Middle East (UAE) (me-central-1) Region, but may be made available later | 
| Access method to the Spark cluster | Supports SSH, a REPL shell, Jupyter notebooks, and IDEs (for example, PyCharm) | Supports AWS Glue Studio notebooks, Jupyter notebooks, various IDEs (for example, Visual Studio Code and PyCharm), and SageMaker AI notebooks | 
| Time to first query | Requires 10–15 minutes to set up a Spark cluster | Takes up to 1 minute to set up an ephemeral Spark cluster | 
| Price model | AWS charges for development endpoints based on the time that the endpoint is provisioned and the number of DPUs. Development endpoints do not time out, and there is a 10-minute minimum billing duration for each provisioned endpoint. Additionally, AWS charges for Jupyter notebooks on Amazon EC2 instances and for SageMaker AI notebooks when you configure them with dev endpoints. | AWS charges for interactive sessions based on the time that the session is active and the number of DPUs. Interactive sessions have configurable idle timeouts, and there is a 1-minute minimum billing duration for each session. AWS Glue Studio notebooks provide a built-in interface for interactive sessions and are offered at no additional cost. | 
| Console experience | Only available via the CLI and API | Available through the AWS Glue console, CLI, and APIs | 

# Migrating from dev endpoints to interactive sessions


 Use the following checklist to determine the appropriate method to migrate from dev endpoints to interactive sessions. 

 **Does your script depend on AWS Glue 0.9 or 1.0 specific features (for example, HDFS, YARN, etc.)?** 

 If the answer is yes, see [Migrating AWS Glue jobs to AWS Glue version 3.0](https://docs.aws.amazon.com/glue/latest/dg/migrating-version-30.html) to learn how to migrate from AWS Glue 0.9 or 1.0 to AWS Glue 3.0 or later. 

 **Which method do you use to access your dev endpoint?** 


| If you use this method | Then do this | 
| --- | --- | 
| SageMaker AI notebook, Jupyter notebook, or JupyterLab | Migrate to an [AWS Glue Studio notebook](https://docs.aws.amazon.com/glue/latest/dg/interactive-sessions-gs-notebook.html) by downloading your .ipynb files from Jupyter and uploading them to a new AWS Glue Studio notebook job. Alternatively, you can use [SageMaker AI Studio](https://aws.amazon.com/blogs/machine-learning/prepare-data-at-scale-in-amazon-sagemaker-studio-using-serverless-aws-glue-interactive-sessions/) and select the AWS Glue kernel. | 
| Zeppelin notebook | Convert the notebook to a Jupyter notebook manually by copying and pasting code or automatically using a third-party converter such as ze2nb. Then, use the notebook in AWS Glue Studio notebook or SageMaker AI Studio.  | 
| IDE |  See [ Author AWS Glue jobs with PyCharm using AWS Glue interactive sessions](https://aws.amazon.com/blogs/big-data/author-aws-glue-jobs-with-pycharm-using-aws-glue-interactive-sessions/), or [ Using interactive sessions with Microsoft Visual Studio Code](https://docs.aws.amazon.com/glue/latest/dg/interactive-sessions-vscode.html).  | 
| REPL | Install [AWS Glue interactive sessions](https://docs.aws.amazon.com/glue/latest/dg/interactive-sessions.html) locally, then run the command described in the [AWS documentation](http://docs.aws.amazon.com/glue/latest/dg/development-migration-checklist.html). | 
| SSH | No corresponding option on interactive sessions. Alternatively, you can use a Docker image. To learn more, see [Developing using a Docker image](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-libraries.html#develop-local-docker-image).  | 

The following sections provide information on using dev endpoints to develop jobs in AWS Glue version 1.0.

**Topics**
+ [Migrating from dev endpoints to interactive sessions](development-migration-checklist.md)
+ [Developing scripts using development endpoints](dev-endpoint.md)
+ [Managing notebooks](notebooks-with-glue.md)

# Developing scripts using development endpoints


**Note**  
 Development endpoints are only supported for versions of AWS Glue prior to 2.0. For an interactive environment where you can author and test ETL scripts, use [Notebooks on AWS Glue Studio](https://docs.aws.amazon.com/glue/latest/ug/notebooks-chapter.html). 

AWS Glue can create an environment—known as a *development endpoint*—that you can use to iteratively develop and test your extract, transform, and load (ETL) scripts. You can create, edit, and delete development endpoints using the AWS Glue API or AWS Command Line Interface.

## Managing your development environment

When you create a development endpoint, you provide configuration values to provision the development environment. These values tell AWS Glue how to set up the network so that you can access the endpoint securely and the endpoint can access your data stores.

You can then create a notebook that connects to the endpoint, and use your notebook to author and test your ETL script. When you're satisfied with the results of your development process, you can create an ETL job that runs your script. With this process, you can add functions and debug your scripts in an interactive manner.

Follow the tutorials in this section to learn how to use your development endpoint with notebooks.

**Topics**
+ [Managing your development environment](#dev-endpoint-managing-dev-environment)
+ [Development endpoint workflow](dev-endpoint-workflow.md)
+ [How AWS Glue development endpoints work with SageMaker notebooks](dev-endpoint-how-it-works.md)
+ [Adding a development endpoint](add-dev-endpoint.md)
+ [Accessing your development endpoint](dev-endpoint-elastic-ip.md)
+ [Tutorial: Set up a Jupyter notebook in JupyterLab to test and debug ETL scripts](dev-endpoint-tutorial-local-jupyter.md)
+ [Tutorial: Use a SageMaker AI notebook with your development endpoint](dev-endpoint-tutorial-sage.md)
+ [Tutorial: Use a REPL shell with your development endpoint](dev-endpoint-tutorial-repl.md)
+ [Tutorial: Set up PyCharm professional with a development endpoint](dev-endpoint-tutorial-pycharm.md)
+ [Advanced configuration: sharing development endpoints among multiple users](dev-endpoint-sharing.md)

# Development endpoint workflow


To use an AWS Glue development endpoint, you can follow this workflow:

1. Create a development endpoint using the API. The endpoint is launched in a virtual private cloud (VPC) with your defined security groups.

1. Poll the development endpoint through the API until it is provisioned and ready for work. When it's ready, connect to the development endpoint using one of the following methods to create and test AWS Glue scripts.
   + Create a SageMaker AI notebook in your account. For more information about how to create a notebook, see [Authoring code with AWS Glue Studio notebooks](notebooks-chapter.md).
   + Open a terminal window to connect directly to a development endpoint.
   + If you have the professional edition of the JetBrains [PyCharm Python IDE](https://www.jetbrains.com/pycharm/), connect it to a development endpoint and use it to develop interactively. If you insert `pydevd` statements in your script, PyCharm can support remote breakpoints.

1. When you finish debugging and testing on your development endpoint, you can delete it.
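The create → poll → delete workflow above can be sketched in Python. The AWS Glue operations named in the comments (`CreateDevEndpoint`, `GetDevEndpoint`, `DeleteDevEndpoint`) are real API actions, but this helper takes the status lookup as a plain callable so the polling logic can be shown without an AWS account; the timeout and poll interval are illustrative assumptions:

```python
import time

def wait_until_ready(get_status, timeout_s=1800, poll_s=30, sleep=time.sleep):
    """Poll until a dev endpoint reports READY.

    get_status() should return the endpoint's Status string -- for example,
    boto3's glue.get_dev_endpoint(EndpointName=...)["DevEndpoint"]["Status"],
    which maps to the GetDevEndpoint API operation.
    """
    waited = 0
    while waited <= timeout_s:
        status = get_status()
        if status == "READY":
            return True
        if status in ("FAILED", "TERMINATED"):
            raise RuntimeError(f"dev endpoint entered state {status}")
        sleep(poll_s)
        waited += poll_s
    raise TimeoutError("dev endpoint did not become READY in time")
```

Once the endpoint is ready, connect with a notebook, terminal, or PyCharm as described above; when you finish, delete it (the `DeleteDevEndpoint` operation) so you stop being billed for it.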

# How AWS Glue development endpoints work with SageMaker notebooks

One of the common ways to access your development endpoints is to use [Jupyter](https://jupyter.org/) on SageMaker notebooks. The Jupyter notebook is an open-source web application that is widely used for visualization, data analytics, and machine learning. An AWS Glue SageMaker notebook provides you with a Jupyter notebook experience on AWS Glue development endpoints. In the AWS Glue SageMaker notebook, the Jupyter notebook environment is pre-configured with [SparkMagic](https://github.com/jupyter-incubator/sparkmagic), an open-source Jupyter plugin that submits Spark jobs to a remote Spark cluster. [Apache Livy](https://livy.apache.org) is a service that allows interaction with a remote Spark cluster over a REST API. In the AWS Glue SageMaker notebook, SparkMagic is configured to call the REST API against a Livy server running on an AWS Glue development endpoint.

The following text flow explains how each component works:

 *AWS Glue SageMaker notebook: (Jupyter → SparkMagic) → (network) →  AWS Glue development endpoint: (Apache Livy → Apache Spark)* 

When you run a Spark script written in a paragraph of a Jupyter notebook, the Spark code is submitted to the Livy server via SparkMagic, and a Spark job named "livy-session-N" runs on the Spark cluster. This job is called a Livy session. The Spark job runs while the notebook session is alive, and is terminated when you shut down the Jupyter kernel from the notebook or when the session times out. One Spark job is launched per notebook (.ipynb) file.

You can use a single AWS Glue development endpoint with multiple SageMaker notebook instances, and you can create multiple notebook files in each SageMaker notebook instance. When you open a notebook file and run its paragraphs, a Livy session is launched for that file on the Spark cluster via SparkMagic. Each Livy session corresponds to a single Spark job.
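To make the SparkMagic → Livy handoff concrete, the following sketch builds the two REST calls involved: creating a session and submitting a paragraph as a statement. The `/sessions` and `/sessions/{id}/statements` paths follow the Livy REST API; the host and port assume the SSH port forwarding used elsewhere in this guide, and the helper functions themselves are illustrative, not part of SparkMagic:

```python
import json

LIVY_URL = "http://localhost:8998"  # assumption: Livy reached via SSH port forwarding

def create_session_request():
    # POST /sessions starts a Livy session, which appears on the
    # Spark cluster as a job named "livy-session-N"
    return f"{LIVY_URL}/sessions", json.dumps({"kind": "pyspark"})

def run_statement_request(session_id, code):
    # POST /sessions/{id}/statements submits one notebook paragraph
    return (f"{LIVY_URL}/sessions/{session_id}/statements",
            json.dumps({"code": code}))
```

SparkMagic then polls the statement for its result and renders it back into the notebook cell.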

## Default behavior for AWS Glue development endpoints and SageMaker notebooks

The Spark jobs run based on the [Spark configuration](https://spark.apache.org/docs/2.4.3/configuration.html). There are multiple ways to set the Spark configuration (for example, the Spark cluster configuration or SparkMagic's configuration).

By default, Spark allocates cluster resources to a Livy session based on the Spark cluster configuration. In AWS Glue development endpoints, the cluster configuration depends on the worker type. The following table shows the common configurations per worker type.



|  | Standard | G.1X | G.2X | 
| --- | --- | --- | --- | 
|  spark.driver.memory  | 5G | 10G | 20G | 
|  spark.executor.memory  | 5G | 10G | 20G | 
|  spark.executor.cores  | 4 | 8 | 16 | 
|  spark.dynamicAllocation.enabled  | TRUE | TRUE | TRUE | 

The maximum number of Spark executors is calculated automatically from the combination of DPU (or `NumberOfWorkers`) and worker type.



|  | Standard | G.1X | G.2X | 
| --- | --- | --- | --- | 
| The maximum number of Spark executors |  (DPU - 1) * 2 - 1  |  (NumberOfWorkers - 1)  |  (NumberOfWorkers - 1)  | 

For example, if your development endpoint has 10 workers and the worker type is `G.1X`, then you will have 9 Spark executors, and the entire cluster will have 90G of executor memory because each executor has 10G of memory.
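The executor formulas in the table above can be wrapped in a small helper. The arithmetic comes directly from the table; the function itself is just an illustrative convenience, not an AWS API:

```python
def max_spark_executors(worker_type, capacity):
    """Maximum number of Spark executors on a dev endpoint.

    capacity is the DPU count for Standard workers, or NumberOfWorkers
    for G.1X and G.2X workers.
    """
    if worker_type == "Standard":
        return (capacity - 1) * 2 - 1
    if worker_type in ("G.1X", "G.2X"):
        return capacity - 1
    raise ValueError(f"unknown worker type: {worker_type}")
```

For the example above, `max_spark_executors("G.1X", 10)` returns 9.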

Regardless of the specified worker type, Spark dynamic resource allocation is turned on. If a dataset is large enough, Spark may allocate all the executors to a single Livy session, because `spark.dynamicAllocation.maxExecutors` is not set by default. This means that other Livy sessions on the same dev endpoint wait to launch new executors. If the dataset is small, Spark can allocate executors to multiple Livy sessions at the same time.

**Note**  
For more information about how resources are allocated in different use cases and how you set a configuration to modify the behavior, see [Advanced configuration: sharing development endpoints among multiple users](dev-endpoint-sharing.md).

# Adding a development endpoint


Use development endpoints to iteratively develop and test your extract, transform, and load (ETL) scripts in AWS Glue. Working with development endpoints is only available through the AWS Command Line Interface.

1. In a command line window, enter a command similar to the following.

   ```
   aws glue create-dev-endpoint --endpoint-name "endpoint1" --role-arn "arn:aws:iam::account-id:role/role-name" --number-of-nodes "3" --glue-version "1.0" --arguments '{"GLUE_PYTHON_VERSION": "3"}' --region "region-name"
   ```

   This command specifies AWS Glue version 1.0. Because this version supports both Python 2 and Python 3, you can use the `arguments` parameter to indicate the desired Python version. If the `glue-version` parameter is omitted, AWS Glue version 0.9 is assumed. For more information about AWS Glue versions, see the [Glue version job property](add-job.md#glue-version-table).

   For information about additional command line parameters, see [create-dev-endpoint](https://docs.aws.amazon.com/cli/latest/reference/glue/create-dev-endpoint.html) in the *AWS CLI Command Reference*.

1. (Optional) Enter the following command to check the development endpoint status. When the status changes to `READY`, the development endpoint is ready to use.

   ```
   aws glue get-dev-endpoint --endpoint-name "endpoint1"
   ```

# Accessing your development endpoint


When you create a development endpoint in a virtual private cloud (VPC), AWS Glue returns only a private IP address. The public IP address field is not populated. When you create a non-VPC development endpoint, AWS Glue returns only a public IP address.

If your development endpoint has a **Public address**, confirm that it is reachable with the SSH private key for the development endpoint, as in the following example.

```
ssh -i dev-endpoint-private-key.pem glue@public-address
```

Suppose that your development endpoint has a **Private address**, your VPC subnet is routable from the public internet, and its security groups allow inbound access from your client. In this case, follow these steps to attach an *Elastic IP address* to a development endpoint to allow access from the internet.

**Note**  
If you want to use Elastic IP addresses, the subnet that is being used requires an internet gateway associated through the route table.

**To access a development endpoint by attaching an Elastic IP address**

1. Open the AWS Glue console at [https://console.aws.amazon.com/glue/](https://console.aws.amazon.com/glue/).

1. In the navigation pane, choose **Dev endpoints**, and navigate to the development endpoint details page. Record the **Private address** for use in the next step. 

1. Open the Amazon EC2 console at [https://console.aws.amazon.com/ec2/](https://console.aws.amazon.com/ec2/).

1. In the navigation pane, under **Network & Security**, choose **Network Interfaces**. 

1. Search for the **Private DNS (IPv4)** that corresponds to the **Private address** on the AWS Glue console development endpoint details page. 

   You might need to modify which columns are displayed on your Amazon EC2 console. Note the **Network interface ID** (ENI) for this address (for example, `eni-12345678`).

1. On the Amazon EC2 console, under **Network & Security**, choose **Elastic IPs**. 

1. Choose **Allocate new address**, and then choose **Allocate** to allocate a new Elastic IP address.

1. On the **Elastic IPs** page, choose the newly allocated **Elastic IP**. Then choose **Actions**, **Associate address**.

1. On the **Associate address** page, do the following:
   + For **Resource type**, choose **Network interface**.
   + In the **Network interface** box, enter the **Network interface ID** (ENI) for the private address.
   + Choose **Associate**.

1. Confirm that the newly associated Elastic IP address is reachable with the SSH private key that is associated with the development endpoint, as in the following example. 

   ```
   ssh -i dev-endpoint-private-key.pem glue@elastic-ip
   ```

   For information about using a bastion host to get SSH access to the development endpoint’s private address, see the AWS Security Blog post [Securely Connect to Linux Instances Running in a Private Amazon VPC](https://aws.amazon.com/blogs/security/securely-connect-to-linux-instances-running-in-a-private-amazon-vpc/).

# Tutorial: Set up a Jupyter notebook in JupyterLab to test and debug ETL scripts

In this tutorial, you connect a Jupyter notebook in JupyterLab running on your local machine to a development endpoint. You do this so that you can interactively run, debug, and test AWS Glue extract, transform, and load (ETL) scripts before deploying them. This tutorial uses Secure Shell (SSH) port forwarding to connect your local machine to an AWS Glue development endpoint. For more information, see [Port forwarding](https://en.wikipedia.org/wiki/Port_forwarding) on Wikipedia.

## Step 1: Install JupyterLab and Sparkmagic

You can install JupyterLab by using `conda` or `pip`. `conda` is an open-source package management system and environment management system that runs on Windows, macOS, and Linux. `pip` is the package installer for Python.

If you're installing on macOS, you must have Xcode installed before you can install Sparkmagic.

1. Install JupyterLab, Sparkmagic, and the related extensions.

   ```
   $ conda install -c conda-forge jupyterlab
   $ pip install sparkmagic
   $ jupyter nbextension enable --py --sys-prefix widgetsnbextension
   $ jupyter labextension install @jupyter-widgets/jupyterlab-manager
   ```

1. Find the `sparkmagic` installation directory from the `Location` field in the output of the following command. 

   ```
   $ pip show sparkmagic | grep Location
   Location: /Users/username/.pyenv/versions/anaconda3-5.3.1/lib/python3.7/site-packages
   ```

1. Change your directory to the one returned for `Location`, and install the kernels for Scala and PySpark.

   ```
   $ cd /Users/username/.pyenv/versions/anaconda3-5.3.1/lib/python3.7/site-packages
   $ jupyter-kernelspec install sparkmagic/kernels/sparkkernel
   $ jupyter-kernelspec install sparkmagic/kernels/pysparkkernel
   ```

1. Download a sample `config` file. 

   ```
   $ curl -o ~/.sparkmagic/config.json https://raw.githubusercontent.com/jupyter-incubator/sparkmagic/master/sparkmagic/example_config.json
   ```

   In this configuration file, you can configure Spark-related parameters like `driverMemory` and `executorCores`.
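   For example, the relevant portion of `config.json` might look like the following. The `session_configs` block is what SparkMagic sends to Livy when it starts a session; the particular memory and core values here are placeholder assumptions, not recommendations.

   ```json
   {
     "session_configs": {
       "driverMemory": "2G",
       "executorCores": 4
     }
   }
   ```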

## Step 2: Start JupyterLab


Start JupyterLab. Your default web browser opens automatically, and the URL `http://localhost:8888/lab/workspaces/{workspace_name}` is shown.

```
$ jupyter lab
```

## Step 3: Initiate SSH port forwarding to connect to your development endpoint

Next, use SSH local port forwarding to forward a local port (here, `8998`) to the remote destination that is defined by AWS Glue (`169.254.76.1:8998`). 

1. Open a separate terminal window that gives you access to SSH. In Microsoft Windows, you can use the BASH shell provided by [Git for Windows](https://git-scm.com/downloads), or you can install [Cygwin](https://www.cygwin.com/).

1. Run the following SSH command, modified as follows:
   + Replace `private-key-file-path` with a path to the `.pem` file that contains the private key corresponding to the public key that you used to create your development endpoint.
   + If you're forwarding a different port than `8998`, replace `8998` with the port number that you're actually using locally. The address `169.254.76.1:8998` is the remote port and isn't changed by you.
   + Replace `dev-endpoint-public-dns` with the public DNS address of your development endpoint. To find this address, navigate to your development endpoint in the AWS Glue console, choose the name, and copy the **Public address** that's listed on the **Endpoint details** page.

   ```
   ssh -i private-key-file-path -NTL 8998:169.254.76.1:8998 glue@dev-endpoint-public-dns
   ```

   You will likely see a warning message like the following:

   ```
   The authenticity of host 'ec2-xx-xxx-xxx-xx.us-west-2.compute.amazonaws.com (xx.xxx.xxx.xx)'
   can't be established.  ECDSA key fingerprint is SHA256:4e97875Brt+1wKzRko+JflSnp21X7aTP3BcFnHYLEts.
   Are you sure you want to continue connecting (yes/no)?
   ```

   Enter **yes** and leave the terminal window open while you use JupyterLab. 

1. Check that SSH port forwarding is working with the development endpoint correctly.

   ```
   $ curl localhost:8998/sessions
   {"from":0,"total":0,"sessions":[]}
   ```

## Step 4: Run a simple script fragment in a notebook paragraph

Now your notebook in JupyterLab should work with your development endpoint. Enter the following script fragment into your notebook and run it.

1. Check that Spark is running successfully. The following command instructs Spark to calculate `1` and then print the value.

   ```
   spark.sql("select 1").show()
   ```

1. Check that the AWS Glue Data Catalog integration is working. The following command lists the tables in the Data Catalog.

   ```
   spark.sql("show tables").show()
   ```

1. Check that a simple script fragment that uses AWS Glue libraries works.

   The following script uses the `persons_json` table metadata in the AWS Glue Data Catalog to create a `DynamicFrame` from your sample data. It then prints out the item count and the schema of this data. 

```
import sys
from pyspark.context import SparkContext
from awsglue.context import GlueContext
 
# Create a Glue context
glueContext = GlueContext(SparkContext.getOrCreate())
 
# Create a DynamicFrame using the 'persons_json' table
persons_DyF = glueContext.create_dynamic_frame.from_catalog(database="legislators", table_name="persons_json")
 
# Print out information about *this* data
print("Count:  ", persons_DyF.count())
persons_DyF.printSchema()
```

The output of the script is as follows.

```
 Count:  1961
 root
 |-- family_name: string
 |-- name: string
 |-- links: array
 |    |-- element: struct
 |    |    |-- note: string
 |    |    |-- url: string
 |-- gender: string
 |-- image: string
 |-- identifiers: array
 |    |-- element: struct
 |    |    |-- scheme: string
 |    |    |-- identifier: string
 |-- other_names: array
 |    |-- element: struct
 |    |    |-- note: string
 |    |    |-- name: string
 |    |    |-- lang: string
 |-- sort_name: string
 |-- images: array
 |    |-- element: struct
 |    |    |-- url: string
 |-- given_name: string
 |-- birth_date: string
 |-- id: string
 |-- contact_details: array
 |    |-- element: struct
 |    |    |-- type: string
 |    |    |-- value: string
 |-- death_date: string
```

## Troubleshooting
+ During the installation of JupyterLab, if your computer is behind a corporate proxy or firewall, you might encounter HTTP and SSL errors due to custom security profiles managed by corporate IT departments.

  The following is an example of a typical error that occurs when `conda` can't connect to its own repositories:

  ```
  CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://repo.anaconda.com/pkgs/main/win-64/current_repodata.json>
  ```

  This might happen because your company blocks connections to widely used repositories in the Python and JavaScript communities. For more information, see [Installation Problems](https://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html#installation-problems) on the JupyterLab website.
+ If you encounter a *connection refused* error when trying to connect to your development endpoint, you might be using a development endpoint that is out of date. Try creating a new development endpoint and reconnecting.
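For the proxy-related installation failures described above, conda can be routed through a corporate proxy via its documented `proxy_servers` setting in `~/.condarc`. The host and port below are placeholders you would replace with values from your IT department:

```yaml
# ~/.condarc -- route conda traffic through a corporate proxy (placeholder values)
proxy_servers:
  http: http://proxy.example.com:8080
  https: http://proxy.example.com:8080
```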

# Tutorial: Use a SageMaker AI notebook with your development endpoint

 In AWS Glue, you can create a development endpoint and then create a SageMaker AI notebook to help develop your ETL and machine learning scripts. A SageMaker AI notebook is a fully managed machine learning compute instance running the Jupyter Notebook application.

1. In the AWS Glue console, choose **Dev endpoints** to navigate to the development endpoints list. 

1. Select the check box next to the name of a development endpoint that you want to use, and on the **Action** menu, choose **Create SageMaker notebook**.

1. Fill out the **Create and configure a notebook** page as follows:

   1. Enter a notebook name.

   1. Under **Attach to development endpoint**, verify the development endpoint.

   1. Create or choose an AWS Identity and Access Management (IAM) role.

      Creating a role is recommended. If you use an existing role, ensure that it has the required permissions. For more information, see [Step 6: Create an IAM policy for SageMaker AI notebooks](create-sagemaker-notebook-policy.md).

   1. (Optional) Choose a VPC, a subnet, and one or more security groups.

   1. (Optional) Choose an AWS Key Management Service encryption key.

   1. (Optional) Add tags for the notebook instance.

1. Choose **Create notebook**. On the **Notebooks** page, choose the refresh icon at the upper right, and continue until the **Status** shows `Ready`.

1. Select the check box next to the new notebook name, and then choose **Open notebook**.

1. Create a new notebook: On the **jupyter** page, choose **New**, and then choose **Sparkmagic (PySpark)**.

   Your screen should now look like the following:  
![\[The jupyter page has a menu bar, toolbar, and a wide text field into which you can enter statements.\]](http://docs.aws.amazon.com/glue/latest/dg/images/sagemaker-notebook.png)

1. (Optional) At the top of the page, choose **Untitled**, and give the notebook a name.

1. To start a Spark application, enter the following command into the notebook, and then in the toolbar, choose **Run**.

   ```
   spark
   ```

   After a short delay, you should see the following response:  
![\[The system response shows Spark application status and outputs the following message: SparkSession available as 'spark'.\]](http://docs.aws.amazon.com/glue/latest/dg/images/spark-command-response.png)

1. Create a dynamic frame and run a query against it: Copy, paste, and run the following code, which outputs the count and schema of the `persons_json` table.

   ```
   import sys
   from pyspark.context import SparkContext
   from awsglue.context import GlueContext
   from awsglue.transforms import *
   glueContext = GlueContext(SparkContext.getOrCreate())
   persons_DyF = glueContext.create_dynamic_frame.from_catalog(database="legislators", table_name="persons_json")
   print("Count:  ", persons_DyF.count())
   persons_DyF.printSchema()
   ```

# Tutorial: Use a REPL shell with your development endpoint

 In AWS Glue, you can create a development endpoint and then invoke a REPL (Read–Evaluate–Print Loop) shell to run PySpark code incrementally so that you can interactively debug your ETL scripts before deploying them.

 In order to use a REPL on a development endpoint, you need to have authorization to SSH to the endpoint. 

1. On your local computer, open a terminal window that can run SSH commands, and paste in the edited SSH command. Run the command.

   Assuming that you accepted AWS Glue version 1.0 with Python 3 for the development endpoint, the output will look like this:

   ```
   Python 3.6.8 (default, Aug  2 2019, 17:42:44)
   [GCC 4.8.5 20150623 (Red Hat 4.8.5-28)] on linux
   Type "help", "copyright", "credits" or "license" for more information.
   SLF4J: Class path contains multiple SLF4J bindings.
   SLF4J: Found binding in [jar:file:/usr/share/aws/glue/etl/jars/glue-assembly.jar!/org/slf4j/impl/StaticLoggerBinder.class]
   SLF4J: Found binding in [jar:file:/usr/lib/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
   SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
   SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
   Setting default log level to "WARN".
   To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
   2019-09-23 22:12:23,071 WARN  [Thread-5] yarn.Client (Logging.scala:logWarning(66)) - Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
   2019-09-23 22:12:26,562 WARN  [Thread-5] yarn.Client (Logging.scala:logWarning(66)) - Same name resource file:/usr/lib/spark/python/lib/pyspark.zip added multiple times to distributed cache
   2019-09-23 22:12:26,580 WARN  [Thread-5] yarn.Client (Logging.scala:logWarning(66)) - Same path resource file:///usr/share/aws/glue/etl/python/PyGlue.zip added multiple times to distributed cache.
   2019-09-23 22:12:26,581 WARN  [Thread-5] yarn.Client (Logging.scala:logWarning(66)) - Same path resource file:///usr/lib/spark/python/lib/py4j-src.zip added multiple times to distributed cache.
   2019-09-23 22:12:26,581 WARN  [Thread-5] yarn.Client (Logging.scala:logWarning(66)) - Same path resource file:///usr/share/aws/glue/libs/pyspark.zip added multiple times to distributed cache.
   Welcome to
         ____              __
        / __/__  ___ _____/ /__
       _\ \/ _ \/ _ `/ __/  '_/
      /__ / .__/\_,_/_/ /_/\_\   version 2.4.3
         /_/
   
   Using Python version 3.6.8 (default, Aug  2 2019 17:42:44)
   SparkSession available as 'spark'.
   >>>
   ```

1. Test that the REPL shell is working correctly by typing the statement `print(spark.version)`. If it displays the Spark version, your REPL is ready to use.

1. Now you can try executing the following simple script, line by line, in the shell:

   ```
   import sys
   from pyspark.context import SparkContext
   from awsglue.context import GlueContext
   from awsglue.transforms import *
   glueContext = GlueContext(SparkContext.getOrCreate())
   persons_DyF = glueContext.create_dynamic_frame.from_catalog(database="legislators", table_name="persons_json")
   print("Count:  ", persons_DyF.count())
   persons_DyF.printSchema()
   ```

# Tutorial: Set up PyCharm professional with a development endpoint

This tutorial shows you how to connect the [PyCharm Professional](https://www.jetbrains.com/pycharm/) Python IDE running on your local machine to a development endpoint so that you can interactively run, debug, and test AWS Glue extract, transform, and load (ETL) scripts before deploying them. The instructions and screen captures in the tutorial are based on PyCharm Professional version 2019.3.

To connect to a development endpoint interactively, you must have PyCharm Professional installed. You can't do this using the free edition.

**Note**  
The tutorial uses Amazon S3 as a data source. If you want to use a JDBC data source instead, you must run your development endpoint in a virtual private cloud (VPC). To connect with SSH to a development endpoint in a VPC, you must create an SSH tunnel. This tutorial does not include instructions for creating an SSH tunnel. For information on using SSH to connect to a development endpoint in a VPC, see [Securely Connect to Linux Instances Running in a Private Amazon VPC](https://aws.amazon.com/blogs/security/securely-connect-to-linux-instances-running-in-a-private-amazon-vpc/) in the AWS security blog.

**Topics**
+ [Connecting PyCharm professional to a development endpoint](#dev-endpoint-tutorial-pycharm-connect)
+ [Deploying the script to your development endpoint](#dev-endpoint-tutorial-pycharm-deploy)
+ [Configuring a remote interpreter](#dev-endpoint-tutorial-pycharm-interpreter)
+ [Running your script on the development endpoint](#dev-endpoint-tutorial-pycharm-debug-run)

## Connecting PyCharm professional to a development endpoint

## Connecting PyCharm professional to a development endpoint

1. Create a new pure-Python project in PyCharm named `legislators`.

1. Create a file named `get_person_schema.py` in the project with the following content:

   ```
   from pyspark.context import SparkContext
   from awsglue.context import GlueContext
   
   
   def main():
       # Create a Glue context
       glueContext = GlueContext(SparkContext.getOrCreate())
   
       # Create a DynamicFrame using the 'persons_json' table
       persons_DyF = glueContext.create_dynamic_frame.from_catalog(database="legislators", table_name="persons_json")
   
       # Print out information about this data
       print("Count:  ", persons_DyF.count())
       persons_DyF.printSchema()
   
   
   if __name__ == "__main__":
       main()
   ```

1. Do one of the following:
   + For AWS Glue version 0.9, download the AWS Glue Python library file, `PyGlue.zip`, from `https://s3.amazonaws.com/aws-glue-jes-prod-us-east-1-assets/etl/python/PyGlue.zip` to a convenient location on your local machine.
   + For AWS Glue version 1.0 and later, download the AWS Glue Python library file, `PyGlue.zip`, from `https://s3.amazonaws.com/aws-glue-jes-prod-us-east-1-assets/etl-1.0/python/PyGlue.zip` to a convenient location on your local machine.

1. Add `PyGlue.zip` as a content root for your project in PyCharm:
   + In PyCharm, choose **File**, **Settings** to open the **Settings** dialog box. (You can also press `Ctrl+Alt+S`.)
   + Expand the `legislators` project and choose **Project Structure**. Then in the right pane, choose **Add Content Root**.
   + Navigate to the location where you saved `PyGlue.zip`, select it, then choose **Apply**.

    The **Settings** screen should look something like the following:  
![\[The PyCharm Settings screen with PyGlue.zip added as a content root.\]](http://docs.aws.amazon.com/glue/latest/dg/images/PyCharm_AddContentRoot.png)

   Leave the **Settings** dialog box open after you choose **Apply**.

1. Configure deployment options to upload the local script to your development endpoint using SFTP (this capability is available only in PyCharm Professional):
   + In the **Settings** dialog box, expand the **Build, Execution, Deployment** section. Choose the **Deployment** subsection.
   + Choose the add icon at the top of the middle pane to add a new server. Set its **Type** to `SFTP` and give it a name.
   + Set the **SFTP host** to the **Public address** of your development endpoint, as listed on its details page. (Choose the name of your development endpoint in the AWS Glue console to display the details page). For a development endpoint running in a VPC, set **SFTP host** to the host address and local port of your SSH tunnel to the development endpoint.
   + Set the **User name** to `glue`.
   + Set the **Auth type** to **Key pair (OpenSSH or Putty)**. Set the **Private key file** by browsing to the location of your development endpoint's private key file. Note that PyCharm supports only DSA, RSA, and ECDSA OpenSSH key types, and does not accept keys in PuTTY's private format. You can use an up-to-date version of `ssh-keygen` to generate a key-pair type that PyCharm accepts, using syntax like the following:

     ```
     ssh-keygen -t rsa -f <key_file_name> -C "<your_email_address>"
     ```
   + Choose **Test connection**, and allow the connection to be tested. If the connection succeeds, choose **Apply**.

    The **Settings** screen should now look something like the following:  
![\[The PyCharm Settings screen with an SFTP server defined.\]](http://docs.aws.amazon.com/glue/latest/dg/images/PyCharm_SFTP.png)

   Again, leave the **Settings** dialog box open after you choose **Apply**.

1. Map the local directory to a remote directory for deployment:
   + In the right pane of the **Deployment** page, choose the middle tab at the top, labeled **Mappings**.
   + In the **Deployment Path** column, enter a path under `/home/glue/scripts/` for deployment of your project path. For example: `/home/glue/scripts/legislators`.
   + Choose **Apply**.

    The **Settings** screen should now look something like the following:  
![\[The PyCharm Settings screen after a deployment mapping.\]](http://docs.aws.amazon.com/glue/latest/dg/images/PyCharm_Mapping.png)

   Choose **OK** to close the **Settings** dialog box.

## Deploying the script to your development endpoint

1. Choose **Tools**, **Deployment**, and then choose the name under which you set up your development endpoint, as shown in the following image:  
![\[The menu item for deploying your script.\]](http://docs.aws.amazon.com/glue/latest/dg/images/PyCharm_Deploy.png)

   After your script has been deployed, the bottom of the screen should look something like the following:  
![\[The bottom of the PyCharm screen after a successful deployment.\]](http://docs.aws.amazon.com/glue/latest/dg/images/PyCharm_Deployed.png)

1. On the menu bar, choose **Tools**, **Deployment**, **Automatic Upload (always)**. Ensure that a check mark appears next to **Automatic Upload (always)**.

   When this option is enabled, PyCharm automatically uploads changed files to the development endpoint.

## Configuring a remote interpreter


Configure PyCharm to use the Python interpreter on the development endpoint.

1. From the **File** menu, choose **Settings**.

1. Expand the project **legislators** and choose **Project Interpreter**.

1. Choose the gear icon next to the **Project Interpreter** list, and then choose **Add**.

1. In the **Add Python Interpreter** dialog box, in the left pane, choose **SSH Interpreter**.

1. Choose **Existing server configuration**, and in the **Deployment configuration** list, choose your configuration.

   Your screen should look something like the following image.  
![\[In the left pane, SSH Interpreter is selected, and in the right pane, the Existing server configuration radio button is selected. The Deployment configuration field contains the configuration name and the message "Remote SDK is saved in IDE settings, so it needs the deployment server to be saved there too. Which do you prefer?" The following are the choices beneath that message: "Create copy of this deployment server in IDE settings" and "Move this server to IDE settings."\]](http://docs.aws.amazon.com/glue/latest/dg/images/PyCharm_Interpreter1.png)

1. Choose **Move this server to IDE settings**, and then choose **Next**.

1. In the **Interpreter** field, change the path to `/usr/bin/gluepython` if you are using Python 2, or to `/usr/bin/gluepython3` if you are using Python 3. Then choose **Finish**.

## Running your script on the development endpoint

To run the script:
+ In the left pane, right-click the file name and choose **Run '*<filename>*'**.

  After a series of messages, the final output should show the count and the schema.

  ```
  Count:   1961
  root
  |-- family_name: string
  |-- name: string
  |-- links: array
  |    |-- element: struct
  |    |    |-- note: string
  |    |    |-- url: string
  |-- gender: string
  |-- image: string
  |-- identifiers: array
  |    |-- element: struct
  |    |    |-- scheme: string
  |    |    |-- identifier: string
  |-- other_names: array
  |    |-- element: struct
  |    |    |-- lang: string
  |    |    |-- note: string
  |    |    |-- name: string
  |-- sort_name: string
  |-- images: array
  |    |-- element: struct
  |    |    |-- url: string
  |-- given_name: string
  |-- birth_date: string
  |-- id: string
  |-- contact_details: array
  |    |-- element: struct
  |    |    |-- type: string
  |    |    |-- value: string
  |-- death_date: string
  
  
  Process finished with exit code 0
  ```

You are now set up to debug your script remotely on your development endpoint.

# Advanced configuration: sharing development endpoints among multiple users

This section explains how you can take advantage of development endpoints with SageMaker notebooks in typical use cases to share development endpoints among multiple users.

## Single-tenancy configuration


In single-tenant use cases, to simplify the developer experience and avoid contention for resources, we recommend that each developer use their own development endpoint, sized for the project they are working on. This also simplifies decisions about worker type and DPU count, leaving them to the discretion of each developer and project.

You won't need to manage resource allocation unless you run multiple notebook files concurrently. If you run code in multiple notebook files at the same time, multiple Livy sessions are launched concurrently. To segregate Spark cluster configurations so that multiple Livy sessions can run at the same time, follow the steps introduced for multi-tenancy use cases.

For example, if your development endpoint has 10 workers and the worker type is `G.1X`, then you will have 9 Spark executors, and the entire cluster will have 90 GB of executor memory because each executor has 10 GB of memory.
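
A quick sketch of that arithmetic (a minimal illustration; the per-executor memory figure is the 10 GB `G.1X` value used in this example, and the function name is hypothetical):

```
# One worker hosts the Spark driver; each remaining worker runs one
# executor. For G.1X in this example, each executor has 10 GB of memory.

def cluster_executor_memory_gb(number_of_workers, memory_per_executor_gb=10):
    executors = number_of_workers - 1  # one worker is used by the driver
    return executors, executors * memory_per_executor_gb

executors, total_gb = cluster_executor_memory_gb(10)
print(executors, total_gb)  # prints: 9 90
```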

Regardless of the specified worker type, Spark dynamic resource allocation will be turned on. If a dataset is large enough, Spark may allocate all the executors to a single Livy session since `spark.dynamicAllocation.maxExecutors` is not set by default. This means that other Livy sessions on the same dev endpoint will wait to launch new executors. If the dataset is small, Spark will be able to allocate executors to multiple Livy sessions at the same time.

**Note**  
For more information about how resources are allocated in different use cases and how you set a configuration to modify the behavior, see [Advanced configuration: sharing development endpoints among multiple users](#dev-endpoint-sharing).

## Multi-tenancy configuration


**Note**  
Development endpoints are intended to emulate the AWS Glue ETL environment as a single-tenant environment. While multi-tenant use is possible, it is an advanced use case, and we recommend that most users maintain a pattern of single tenancy for each development endpoint.

In multi-tenant use cases, you might need to manage resource allocation. The key factor is the number of concurrent users who use a Jupyter notebook at the same time. If your team works in a "follow-the-sun" workflow and there is only one Jupyter user in each time zone, then the number of concurrent users is only one, so you won't need to be concerned with resource allocation. However, if your notebook is shared among multiple users and each user submits code on an ad hoc basis, then you need to consider the following points.

To partition Spark cluster resources among multiple users, you can use SparkMagic configurations. There are two different ways to configure SparkMagic.

#### (A) Use the %%configure -f directive


If you want to modify the configuration per Livy session from the notebook, you can run the `%%configure -f` directive in a notebook paragraph.

For example, if you want to run a Spark application with 5 executors, you can run the following command in a notebook paragraph:

```
%%configure -f
{"numExecutors":5}
```

Then you will see only 5 executors running for the job in the Spark UI.

We recommend limiting the maximum number of executors for dynamic resource allocation:

```
%%configure -f
{"conf":{"spark.dynamicAllocation.maxExecutors":"5"}}
```

#### (B) Modify the SparkMagic config file


SparkMagic works based on the [Livy API](https://livy.incubator.apache.org/docs/latest/rest-api.html). SparkMagic creates Livy sessions with configurations such as `driverMemory`, `driverCores`, `executorMemory`, `executorCores`, `numExecutors`, and `conf`. These are the key factors that determine how many resources are consumed from the entire Spark cluster. SparkMagic allows you to provide a config file that specifies the parameters sent to Livy. You can see a sample config file in this [GitHub repository](https://github.com/jupyter-incubator/sparkmagic/blob/master/sparkmagic/example_config.json).

If you want to modify the configuration across all the Livy sessions from a notebook, you can modify `/home/ec2-user/.sparkmagic/config.json` to add a `session_configs` block.

To modify the config file on a SageMaker notebook instance, you can follow these steps.

1. Open a SageMaker notebook.

1. Open the Terminal kernel.

1. Run the following commands:

   ```
   sh-4.2$ cd .sparkmagic
   sh-4.2$ ls
   config.json logs
   sh-4.2$ sudo vim config.json
   ```

   For example, you can add these lines to `/home/ec2-user/.sparkmagic/config.json` and restart the Jupyter kernel from the notebook.

   ```
     "session_configs": {
       "conf": {
         "spark.dynamicAllocation.maxExecutors":"5"
       }
     },
   ```
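
If you prefer to script this edit instead of using `vim`, the following minimal sketch applies the same change with Python's standard library (the helper names are illustrative, not part of SparkMagic):

```
import json
from pathlib import Path

def with_max_executors(config, max_executors):
    # Add (or overwrite) the session_configs block that caps dynamic
    # allocation for every new Livy session.
    conf = config.setdefault("session_configs", {}).setdefault("conf", {})
    conf["spark.dynamicAllocation.maxExecutors"] = str(max_executors)
    return config

def set_max_executors(config_path, max_executors):
    # Rewrite the SparkMagic config file in place, preserving other keys.
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(with_max_executors(config, max_executors), indent=2))

# Example, using the default SparkMagic path on a SageMaker notebook instance:
# set_max_executors("/home/ec2-user/.sparkmagic/config.json", 5)
```

As with the manual edit, restart the Jupyter kernel afterward so the change takes effect.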

### Guidelines and best practices


To avoid resource conflicts of this kind, you can use some basic approaches, such as:
+ Have a larger Spark cluster by increasing the `NumberOfWorkers` (scaling horizontally) and upgrading the `workerType` (scaling vertically)
+ Allocate fewer resources per user (fewer resources per Livy session)

Your approach will depend on your use case. If you have a larger development endpoint, and there is not a huge amount of data, the possibility of a resource conflict will decrease significantly because Spark can allocate resources based on a dynamic allocation strategy.

As described above, the number of Spark executors is calculated automatically from the combination of DPU (or `NumberOfWorkers`) and worker type. Each Spark application launches one driver and multiple executors, so `NumberOfWorkers = NumberOfExecutors + 1`. The table below shows how much capacity you need in your development endpoint based on the number of concurrent users.



| Number of concurrent notebook users | Number of Spark executors you want to allocate per user | Total NumberOfWorkers for your dev endpoint | 
| --- | --- | --- | 
| 3 | 5 | 18 | 
| 10 | 5 | 60 | 
| 50 | 5 | 300 | 
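
The capacity figures in the table follow from the relationship above: each concurrent user needs one worker for their driver plus one worker per executor. A minimal sketch of that arithmetic (the function name is illustrative):

```
# Each concurrent user consumes (executors_per_user + 1) workers:
# one for the driver plus one per executor.

def workers_needed(concurrent_users, executors_per_user):
    return concurrent_users * (executors_per_user + 1)

for users in (3, 10, 50):
    print(users, workers_needed(users, 5))  # 3 -> 18, 10 -> 60, 50 -> 300
```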

If you want to allocate fewer resources per user, `spark.dynamicAllocation.maxExecutors` (or `numExecutors`) is the easiest parameter to configure as a Livy session parameter. If you set the following configuration in `/home/ec2-user/.sparkmagic/config.json`, SparkMagic assigns a maximum of 5 executors per Livy session. This helps segregate resources per Livy session.

```
"session_configs": {
    "conf": {
      "spark.dynamicAllocation.maxExecutors":"5"
    }
  },
```

Suppose there is a dev endpoint with 18 workers (G.1X) and 3 concurrent notebook users. If your session config has `spark.dynamicAllocation.maxExecutors=5`, then each user can make use of 1 driver and 5 executors. There won't be any resource conflicts, even when you run multiple notebook paragraphs at the same time.

#### Trade-offs


With the session config `"spark.dynamicAllocation.maxExecutors":"5"`, you can avoid resource conflict errors, and you do not need to wait for resource allocation when there are concurrent user accesses. However, even when many resources are free (for example, when there are no other concurrent users), Spark cannot assign more than 5 executors to your Livy session.

#### Other notes


It is a good practice to stop the Jupyter kernel when you stop using a notebook. This frees resources, and other notebook users can use them immediately without waiting for the kernel to expire (auto-shutdown).

### Common issues


Even when following the guidelines, you may experience certain issues.

#### Session not found


When you try to run a notebook paragraph after your Livy session has already been terminated, you will see the following message. To activate a Livy session, restart the Jupyter kernel by choosing **Kernel** > **Restart** in the Jupyter menu, then run the notebook paragraph again.

```
An error was encountered:
Invalid status code '404' from http://localhost:8998/sessions/13 with error payload: "Session '13' not found."
```

#### Not enough YARN resources


When you try to run a notebook paragraph but your Spark cluster does not have enough resources to start a new Livy session, you will see the following message. You can often avoid this issue by following the guidelines, but you might still encounter it. To work around the issue, check whether there are any unneeded, active Livy sessions. If there are, terminate them to free cluster resources. See the next section for details.

```
Warning: The Spark session does not have enough YARN resources to start. 
The code failed because of a fatal error:
    Session 16 did not start up in 60 seconds..

Some things to try:
a) Make sure Spark has enough available resources for Jupyter to create a Spark context.
b) Contact your Jupyter administrator to make sure the Spark magics library is configured correctly.
c) Restart the kernel.
```

### Monitoring and debugging


This section describes techniques for monitoring resources and sessions.

#### Monitoring and debugging cluster resource allocation


You can watch the Spark UI to monitor how many resources are allocated per Livy session, and what the effective Spark configurations are for the job. To activate the Spark UI, see [Enabling the Apache Spark Web UI for Development Endpoints](https://docs.aws.amazon.com/glue/latest/dg/monitor-spark-ui-dev-endpoints.html).

(Optional) If you need a real-time view of the Spark UI, you can configure an SSH tunnel against the Spark history server running on the Spark cluster.

```
ssh -i <private-key.pem> -N -L 8157:<development endpoint public address>:18080 glue@<development endpoint public address>
```

You can then open `http://localhost:8157` in your browser to view the Spark UI.

#### Free unneeded Livy sessions


Review these procedures to shut down any unneeded Livy sessions from a notebook or a Spark cluster.

**(a). Terminate Livy sessions from a notebook**  
You can shut down the kernel on a Jupyter notebook to terminate unneeded Livy sessions.

**(b). Terminate Livy sessions from a Spark cluster**  
If there are unneeded Livy sessions which are still running, you can shut down the Livy sessions on the Spark cluster.

As a prerequisite for this procedure, you need to configure your SSH public key for your development endpoint.

To log in to the Spark cluster, you can run the following command:

```
$ ssh -i <private-key.pem> glue@<development endpoint public address>
```

You can run the following command to see the active Livy sessions:

```
$ yarn application -list
20/09/25 06:22:21 INFO client.RMProxy: Connecting to ResourceManager at ip-255-1-106-206.ec2.internal/172.38.106.206:8032
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):2
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
application_1601003432160_0005 livy-session-4 SPARK livy default RUNNING UNDEFINED 10% http://ip-255-1-4-130.ec2.internal:41867
application_1601003432160_0004 livy-session-3 SPARK livy default RUNNING UNDEFINED 10% http://ip-255-1-179-185.ec2.internal:33727
```
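
When many applications are listed, a small helper can extract the application IDs of running Livy sessions so you can pass them to `yarn application -kill`. This is a sketch; the parsing assumes the column layout shown in the output above, and the function name is illustrative:

```
# Return the application IDs of RUNNING Livy sessions from the text
# produced by `yarn application -list` (column layout as shown above).

def running_livy_app_ids(listing):
    ids = []
    for line in listing.splitlines():
        fields = line.split()
        if (len(fields) >= 6
                and fields[0].startswith("application_")
                and fields[1].startswith("livy-session")
                and "RUNNING" in fields):
            ids.append(fields[0])
    return ids
```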

You can then shut down the Livy session with the following command:

```
$ yarn application -kill application_1601003432160_0005
20/09/25 06:23:38 INFO client.RMProxy: Connecting to ResourceManager at ip-255-1-106-206.ec2.internal/255.1.106.206:8032
Killing application application_1601003432160_0005
20/09/25 06:23:39 INFO impl.YarnClientImpl: Killed application application_1601003432160_0005
```

# Managing notebooks

**Note**  
 Development Endpoints are only supported for versions of AWS Glue prior to 2.0. For an interactive environment where you can author and test ETL scripts, use [Notebooks on AWS Glue Studio](https://docs.aws.amazon.com/glue/latest/ug/notebooks-chapter.html). 

A notebook enables interactive development and testing of your ETL (extract, transform, and load) scripts on a development endpoint. AWS Glue provides an interface to SageMaker AI Jupyter notebooks. With AWS Glue, you create and manage SageMaker AI notebooks. You can also open SageMaker AI notebooks from the AWS Glue console.

In addition, you can use Apache Spark with SageMaker AI on AWS Glue development endpoints which support SageMaker AI (but not AWS Glue ETL jobs). SageMaker Spark is an open source Apache Spark library for SageMaker AI. For more information, see [Using Apache Spark with Amazon SageMaker](https://docs.aws.amazon.com/sagemaker/latest/dg/apache-spark.html). 


Managing SageMaker AI notebooks with AWS Glue development endpoints is available in the following AWS Regions:

| Region | Code | 
| --- | --- | 
| US East (Ohio) | `us-east-2` | 
| US East (N. Virginia) | `us-east-1` | 
| US West (N. California) | `us-west-1` | 
| US West (Oregon) | `us-west-2` | 
| Asia Pacific (Tokyo) | `ap-northeast-1` | 
| Asia Pacific (Seoul) | `ap-northeast-2` | 
| Asia Pacific (Mumbai) | `ap-south-1` | 
| Asia Pacific (Singapore) | `ap-southeast-1` | 
| Asia Pacific (Sydney) | `ap-southeast-2` | 
| Canada (Central) | `ca-central-1` | 
| Europe (Frankfurt) | `eu-central-1` | 
| Europe (Ireland) | `eu-west-1` | 
| Europe (London) | `eu-west-2` | 

**Topics**