

Amazon Forecast is no longer available to new customers. Existing customers of Amazon Forecast can continue to use the service as normal. [Learn more](https://aws.amazon.com/blogs/machine-learning/transition-your-amazon-forecast-usage-to-amazon-sagemaker-canvas/)

# Amazon Forecast Algorithms
<a name="aws-forecast-choosing-recipes"></a>

An Amazon Forecast predictor uses an algorithm to train a model with your time series datasets. The trained model is then used to generate metrics and predictions. 

 If you are unsure of which algorithm to use to train your model, choose AutoML when creating a predictor and let Forecast train the optimal model for your datasets. Otherwise, you can manually select one of the Amazon Forecast algorithms. 
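Both paths come together in the CreatePredictor request. The sketch below shows the two request shapes using the AWS SDK for Python (boto3); the predictor names, horizon, and dataset group ARN are placeholders, and the actual call is left commented out because it requires valid resources and credentials.

```python
# Request parameters for CreatePredictor; names and ARNs are placeholders.

# Option 1: AutoML -- Forecast picks the best algorithm for the data.
automl_params = {
    "PredictorName": "demo_automl_predictor",
    "PerformAutoML": True,
    "ForecastHorizon": 14,
    "InputDataConfig": {"DatasetGroupArn": "arn:aws:forecast:...:dataset-group/demo"},
    "FeaturizationConfig": {"ForecastFrequency": "D"},
}

# Option 2: manual selection -- pass the algorithm ARN instead.
manual_params = {
    "PredictorName": "demo_cnnqr_predictor",
    "AlgorithmArn": "arn:aws:forecast:::algorithm/CNN-QR",
    "ForecastHorizon": 14,
    "InputDataConfig": {"DatasetGroupArn": "arn:aws:forecast:...:dataset-group/demo"},
    "FeaturizationConfig": {"ForecastFrequency": "D"},
}

# import boto3
# forecast = boto3.client("forecast")
# forecast.create_predictor(**automl_params)
```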

**Python notebooks**  
For a step-by-step guide on using AutoML, see [Getting Started with AutoML](https://github.com/aws-samples/amazon-forecast-samples/blob/master/notebooks/advanced/Getting_started_with_AutoML/Getting_started_with_AutoML.ipynb).

## Built-in Forecast Algorithms
<a name="forecast-algos"></a>

 Amazon Forecast provides six built-in algorithms for you to choose from. These range from commonly used statistical algorithms like Autoregressive Integrated Moving Average (ARIMA) to complex neural network algorithms like CNN-QR and DeepAR+. 

### [CNN-QR](aws-forecast-algo-cnnqr.md)
<a name="cnnqr"></a>

 `arn:aws:forecast:::algorithm/CNN-QR` 

 Amazon Forecast CNN-QR, Convolutional Neural Network - Quantile Regression, is a proprietary machine learning algorithm for forecasting time series using causal convolutional neural networks (CNNs). CNN-QR works best with large datasets containing hundreds of time series. It accepts item metadata, and is the only Forecast algorithm that accepts related time series data without future values. 

### [DeepAR+](aws-forecast-recipe-deeparplus.md)
<a name="deeparplus"></a>

`arn:aws:forecast:::algorithm/Deep_AR_Plus`

 Amazon Forecast DeepAR+ is a proprietary machine learning algorithm for forecasting time series using recurrent neural networks (RNNs). DeepAR+ works best with large datasets containing hundreds of feature time series. The algorithm accepts forward-looking related time series and item metadata. 

### [Prophet](aws-forecast-recipe-prophet.md)
<a name="prophet"></a>

`arn:aws:forecast:::algorithm/Prophet`

 Prophet is a time series forecasting algorithm based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality. It works best with time series with strong seasonal effects and several seasons of historical data. 

### [NPTS](aws-forecast-recipe-npts.md)
<a name="npts"></a>

`arn:aws:forecast:::algorithm/NPTS`

 The Amazon Forecast Non-Parametric Time Series (NPTS) proprietary algorithm is a scalable, probabilistic baseline forecaster. NPTS is especially useful when working with sparse or intermittent time series. Forecast provides four algorithm variants: Standard NPTS, Seasonal NPTS, Climatological Forecaster, and Seasonal Climatological Forecaster. 

### [ARIMA](aws-forecast-recipe-arima.md)
<a name="arima"></a>

`arn:aws:forecast:::algorithm/ARIMA`

 Autoregressive Integrated Moving Average (ARIMA) is a commonly used statistical algorithm for time-series forecasting. The algorithm is especially useful for simple datasets with under 100 time series. 

### [ETS](aws-forecast-recipe-ets.md)
<a name="ets"></a>

`arn:aws:forecast:::algorithm/ETS`

 Exponential Smoothing (ETS) is a commonly used statistical algorithm for time-series forecasting. The algorithm is especially useful for simple datasets with under 100 time series, and datasets with seasonality patterns. ETS computes a weighted average over all observations in the time series dataset as its prediction, with exponentially decreasing weights over time. 
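The weighting scheme can be illustrated with simple exponential smoothing, the most basic member of the ETS family; this is a sketch of the idea, not the Forecast implementation:

```python
def simple_exponential_smoothing(series, alpha):
    """One-step-ahead forecast: a weighted average of all past observations,
    with weights alpha * (1 - alpha)**k decaying exponentially into the past."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level  # newer points weigh more
    return level

demand = [100, 102, 101, 105, 110, 108, 112]
print(simple_exponential_smoothing(demand, alpha=0.5))  # 109.625
```

With `alpha` close to 1 the forecast tracks the most recent observation; with `alpha` close to 0 it approaches a long-run average.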

## Comparing Forecast Algorithms
<a name="comparing-algos"></a>

 Use the following table to find the best option for your time series datasets. 


|  | CNN-QR (Neural Network) | DeepAR+ (Neural Network) | Prophet (Flexible Local) | NPTS (Baseline) | ARIMA (Baseline) | ETS (Baseline) | 
| --- | --- | --- | --- | --- | --- | --- | 
| Computationally intensive training process | High | High | Medium | Low | Low | Low | 
| Accepts historical related time series* | Yes | No | No | No | No | No | 
| Accepts forward-looking related time series* | Yes | Yes | Yes | No | No | No | 
| Accepts item metadata (product color, brand, etc.) | Yes | Yes | No | No | No | No | 
| Accepts the Weather Index built-in featurization | Yes | Yes | Yes | No | No | No | 
| Suitable for sparse datasets | Yes | Yes | No | Yes | No | No | 
| Performs Hyperparameter Optimization (HPO) | Yes | Yes | No | No | No | No | 
| Allows overriding default hyperparameter values | Yes | Yes | No | Yes | No | No | 

\*For more information on related time series, see [Related Time Series](related-time-series-datasets.md). 

# Autoregressive Integrated Moving Average (ARIMA) Algorithm
<a name="aws-forecast-recipe-arima"></a>

Autoregressive Integrated Moving Average ([ARIMA](https://en.wikipedia.org/wiki/Autoregressive_integrated_moving_average)) is a commonly used local statistical algorithm for time-series forecasting. ARIMA captures standard temporal structures (patterned organizations of time) in the input dataset. The Amazon Forecast ARIMA algorithm calls the [Arima function](https://cran.r-project.org/web/packages/forecast/forecast.pdf#Rfn.Arima.1) in the `Package 'forecast'` of the Comprehensive R Archive Network (CRAN).

## How ARIMA Works
<a name="aws-forecast-recipe-arima-how-it-works"></a>

The ARIMA algorithm is especially useful for datasets that can be mapped to stationary time series. The statistical properties of stationary time series, such as autocorrelations, are independent of time. Datasets with stationary time series usually contain a combination of signal and noise. The signal may exhibit a pattern of sinusoidal oscillation or have a seasonal component. ARIMA acts like a filter to separate the signal from the noise, and then extrapolates the signal in the future to make predictions.
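The "integrated" part of ARIMA is one way this filtering works: differencing removes a trend so that what remains has time-independent statistics. A minimal sketch of first-order differencing:

```python
def difference(series, d=1):
    """Apply d rounds of first-order differencing (the 'I' in ARIMA(p,d,q))."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

# A linear trend plus an alternating signal: not stationary as-is.
trend_series = [2 * t + (1 if t % 2 else -1) for t in range(8)]
print(difference(trend_series))  # [4, 0, 4, 0, 4, 0, 4] -- trend removed
```

After differencing, the series oscillates around a constant mean, which is the kind of input the AR and MA terms are designed to model.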

## ARIMA Hyperparameters and Tuning
<a name="aws-forecast-recipe-arima-hyperparamters"></a>

For information about ARIMA hyperparameters and tuning, see the `Arima` function documentation in the [Package 'forecast'](https://cran.r-project.org/web/packages/forecast/forecast.pdf) of [CRAN](https://cran.r-project.org).

Amazon Forecast converts the `DataFrequency` parameter specified in the [CreateDataset](API_CreateDataset.md) operation to the `frequency` parameter of the R [ts](https://www.rdocumentation.org/packages/stats/versions/3.6.1/topics/ts) function using the following table:


| DataFrequency (string) | R ts frequency (integer) | 
| --- | --- | 
| Y | 1 | 
| M | 12 | 
| W | 52 | 
| D | 7 | 
| H | 24 | 
| 30min | 2 | 
| 15min | 4 | 
| 10min | 6 | 
| 5min | 12 | 
| 1min | 60 | 

For `ts` frequencies less than 24, or for short time series, the hyperparameters are set using the `auto.arima` function of the `Package 'forecast'` of [CRAN](https://cran.r-project.org). For frequencies greater than or equal to 24 and long time series, Forecast uses a Fourier series with K = 4, as described in [Forecasting with long seasonal periods](https://robjhyndman.com/hyndsight/longseasonality/).

Supported data frequencies that aren't in the table default to a `ts` frequency of 1.
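Taken together, the table and the fallback rule amount to a small lookup; the following sketch (not Forecast's internal code) illustrates the mapping:

```python
# DataFrequency -> R ts() frequency, per the conversion table above.
TS_FREQUENCY = {
    "Y": 1, "M": 12, "W": 52, "D": 7, "H": 24,
    "30min": 2, "15min": 4, "10min": 6, "5min": 12, "1min": 60,
}

def r_ts_frequency(data_frequency):
    """Map a Forecast DataFrequency string to the R ts() frequency,
    defaulting to 1 for supported frequencies not in the table."""
    return TS_FREQUENCY.get(data_frequency, 1)

print(r_ts_frequency("H"))   # 24
print(r_ts_frequency("3M"))  # 1 (not in the table)
```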

# CNN-QR Algorithm
<a name="aws-forecast-algo-cnnqr"></a>

 Amazon Forecast CNN-QR, Convolutional Neural Network - Quantile Regression, is a proprietary machine learning algorithm for forecasting scalar (one-dimensional) time series using causal convolutional neural networks (CNNs). This supervised learning algorithm trains one global model from a large collection of time series and uses a quantile decoder to make probabilistic predictions.

**Topics**
+ [Getting Started with CNN-QR](#aws-forecast-algo-cnnqr-getting-started)
+ [How CNN-QR Works](#aws-forecast-algo-cnnqr-how-it-works)
+ [Using Related Data with CNN-QR](#aws-forecast-algo-cnnqr-using-rts)
+ [CNN-QR Hyperparameters](#aws-forecast-algo-cnnqr-hyperparameters)
+ [Tips and Best Practices](#aws-forecast-algo-cnnqr-tips)

## Getting Started with CNN-QR
<a name="aws-forecast-algo-cnnqr-getting-started"></a>

 You can train a predictor with CNN-QR in two ways: 

1. Manually selecting the CNN-QR algorithm.

1. Choosing AutoML (CNN-QR is part of AutoML).

 If you are unsure of which algorithm to use, we recommend selecting AutoML, and Forecast will select CNN-QR if it is the most accurate algorithm for your data. To see if CNN-QR was selected as the most accurate model, either use the [DescribePredictor](https://docs.aws.amazon.com/forecast/latest/dg/API_DescribePredictor.html) API or choose the predictor name in the console. 
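As a sketch, checking the selected algorithm from a `DescribePredictor` response might look like the following; the response shape here is based on the API documentation, so verify the field names against your SDK version:

```python
def chosen_algorithm(describe_predictor_response):
    """Extract which algorithm was used from a DescribePredictor response
    (shape sketched from the API docs; verify against your SDK version)."""
    auto_ml = describe_predictor_response.get("AutoMLAlgorithmArns")
    if auto_ml:  # AutoML predictor: list of chosen algorithm ARNs
        return auto_ml[0]
    return describe_predictor_response["AlgorithmArn"]  # manual selection

sample = {"AutoMLAlgorithmArns": ["arn:aws:forecast:::algorithm/CNN-QR"]}
print(chosen_algorithm(sample).rsplit("/", 1)[-1])  # CNN-QR
```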

Here are some key use cases for CNN-QR: 
+  **Forecast with large and complex datasets** - CNN-QR works best when trained with large and complex datasets. The neural network can learn across many datasets, which is useful when you have related time series and item metadata.
+  **Forecast with historical related time series** - CNN-QR does not require related time series to contain data points within the forecast horizon. This added flexibility allows you to include a broader range of related time series and item metadata, such as item price, events, web metrics, and product categories. 

## How CNN-QR Works
<a name="aws-forecast-algo-cnnqr-how-it-works"></a>

CNN-QR is a sequence-to-sequence (Seq2Seq) model for probabilistic forecasting that tests how well a prediction reconstructs the decoding sequence, conditioned on the encoding sequence. 

The algorithm allows for different features in the encoding and the decoding sequences, so you can use a related time series in the encoder, and omit it from the decoder (and vice versa). By default, related time series with data points in the forecast horizon will be included in both the encoder and decoder. Related time series without data points in the forecast horizon will only be included in the encoder. 

CNN-QR performs quantile regression with a hierarchical causal CNN serving as a learnable feature extractor. 

To facilitate learning time-dependent patterns, such as spikes during weekends, CNN-QR automatically creates feature time series based on time-series granularity. For example, CNN-QR creates two feature time series (day-of-month and day-of-year) at a weekly time-series frequency. The algorithm uses these derived feature time series along with the custom feature time series provided during training and inference. The following example shows a target time series, `zi,t`, and two derived time-series features: `ui,1,t` represents the hour of the day, and `ui,2,t` represents the day of the week. 

![\[\]](http://docs.aws.amazon.com/forecast/latest/dg/images/cnnqr-time-frequencies.PNG)


CNN-QR automatically includes these feature time series based on the data frequency and the size of training data. The following table lists the features that can be derived for each supported basic time frequency. 



| Frequency of the Time Series | Derived Features | 
| --- | --- | 
| Minute | minute-of-hour, hour-of-day, day-of-week, day-of-month, day-of-year | 
| Hour | hour-of-day, day-of-week, day-of-month, day-of-year | 
| Day | day-of-week, day-of-month, day-of-year | 
| Week | week-of-month, week-of-year | 
| Month | month-of-year | 
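These derived features are ordinary calendar functions of each timestamp. For a daily series they can be computed as follows (an illustration of the feature definitions, not Forecast's internals):

```python
from datetime import date

def daily_derived_features(d):
    """Calendar features derived for a daily time series."""
    return {
        "day-of-week": d.isoweekday(),       # 1 = Monday .. 7 = Sunday
        "day-of-month": d.day,
        "day-of-year": d.timetuple().tm_yday,
    }

print(daily_derived_features(date(2020, 3, 1)))
# {'day-of-week': 7, 'day-of-month': 1, 'day-of-year': 61}
```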

During training, each time series in the training dataset consists of a pair of adjacent context and forecast windows with fixed predefined lengths. This is shown in the figure below, where the context window is represented in green, and the forecast window is represented in blue. 

You can use a model trained on a given training set to generate predictions for time series in the training set, and for other time series. The training dataset consists of a target time series, which may be associated with a list of related time series and item metadata. 

The figure below shows how this works for an element of a training dataset indexed by `i`. The training dataset consists of a target time series, `zi,t`, and two associated related time series, `xi,1,t` and `xi,2,t`. The first related time series, `xi,1,t`, is a forward-looking time series, and the second, `xi,2,t`, is a historical time series. 

![\[\]](http://docs.aws.amazon.com/forecast/latest/dg/images/cnnqr-short-long-rts.png)


CNN-QR learns across the target time series, `zi,t`, and the related time series, `xi,1,t` and `xi,2,t`, to generate predictions in the forecast window, represented by the orange line. 

## Using Related Data with CNN-QR
<a name="aws-forecast-algo-cnnqr-using-rts"></a>

 CNN-QR supports both historical and forward-looking related time series datasets. If you provide a forward-looking related time series dataset, any missing values are filled using the [future filling method](howitworks-missing-values.md). For more information on historical and forward-looking related time series, see [Using Related Time Series Datasets](related-time-series-datasets.md). 

You can also use item metadata datasets with CNN-QR. These are datasets with static information on the items in your target time series. Item metadata is especially useful for cold start forecasting scenarios where there is little to no historical data. For more information on item metadata, see [Item Metadata](item-metadata-datasets.md).

## CNN-QR Hyperparameters
<a name="aws-forecast-algo-cnnqr-hyperparameters"></a>

 Amazon Forecast optimizes CNN-QR models on selected hyperparameters. When manually selecting CNN-QR, you have the option to pass in training parameters for these hyperparameters. The following table lists the tunable hyperparameters of the CNN-QR algorithm. 


| Parameter Name | Values | Description | 
| --- | --- | --- | 
| `context_length` |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-algo-cnnqr.html)  |  The number of time points that the model reads before making predictions. Typically, CNN-QR has larger values for `context_length` than DeepAR+ because CNN-QR does not use lags to look at further historical data. If the value for `context_length` is outside of a predefined range, CNN-QR will automatically set the default `context_length` to an appropriate value.  | 
| `use_related_data` |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-algo-cnnqr.html)  |  Determines which kinds of related time series data to include in the model. Choose one of four options: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-algo-cnnqr.html) `HISTORICAL` includes all historical related time series, and `FORWARD_LOOKING` includes all forward-looking related time series. You cannot choose a subset of `HISTORICAL` or `FORWARD_LOOKING` related time series.   | 
| `use_item_metadata` |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-algo-cnnqr.html)  |  Determines whether the model includes item metadata.  Choose one of two options: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-algo-cnnqr.html) `use_item_metadata` includes either all provided item metadata or none. You cannot choose a subset of item metadata.   | 
| `epochs` |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-algo-cnnqr.html)  |  The maximum number of complete passes through the training data. Smaller datasets require more epochs.  For large values of `ForecastHorizon` and `context_length`, consider decreasing epochs to improve the training time.   | 
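When selecting CNN-QR manually, these hyperparameters are passed as string values in the `TrainingParameters` map of the CreatePredictor request. The sketch below uses example values; `ALL` is an assumed option name for including all related data (`NONE`, `HISTORICAL`, and `FORWARD_LOOKING` appear elsewhere in this section).

```python
# TrainingParameters for a manually selected CNN-QR predictor.
# All values are strings; "ALL" is an assumed option name.
training_parameters = {
    "context_length": "63",      # time points the model reads before predicting
    "use_related_data": "ALL",   # or "NONE" / "HISTORICAL" / "FORWARD_LOOKING"
    "use_item_metadata": "ALL",  # or "NONE"
    "epochs": "100",             # maximum passes over the training data
}
```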

### Hyperparameter Optimization (HPO)
<a name="aws-forecast-algo-cnnqr-hpo"></a>

Hyperparameter optimization (HPO) is the task of selecting the optimal hyperparameter values for a specific learning objective. With Forecast, you can automate this process in two ways: 

1. Choosing AutoML, and HPO will automatically run for CNN-QR.

1. Manually selecting CNN-QR and setting `PerformHPO = TRUE`.

Additional related time series and item metadata do not always improve the accuracy of your CNN-QR model. When you run AutoML or enable HPO, CNN-QR tests the accuracy of your model with and without the provided related time series and item metadata, and selects the model with the highest accuracy.

Amazon Forecast automatically optimizes the following three hyperparameters during HPO and provides you with the final trained values:
+ **context_length** - determines how far into the past the network can see. The HPO process automatically sets a value for `context_length` that maximizes model accuracy, while taking training time into account.
+ **use_related_data** - determines which forms of related time series data to include in your model. The HPO process automatically checks whether your related time series data improves the model, and selects the optimal setting.
+ **use_item_metadata** - determines whether to include item metadata in your model. The HPO process automatically checks whether your item metadata improves the model, and chooses the optimal setting.

**Note**  
If `use_related_data` is set to `NONE` or `HISTORICAL` when the `Holiday` supplementary feature is selected, including the holiday data did not improve the accuracy of your model.

You can set the HPO configuration for the `context_length` hyperparameter if you set `PerformHPO = TRUE` during manual selection. However, you cannot alter any aspect of the HPO configuration if you choose AutoML. For more information on HPO configuration, refer to the [IntegerParameterRange](https://docs.aws.amazon.com/forecast/latest/dg/API_IntegerParameterRange.html) API. 
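As a sketch, an HPO configuration that overrides the search range for `context_length` might look like the following; the field names follow the IntegerParameterRange API, and the range values are arbitrary examples:

```python
# HPO override for context_length; field names follow the
# IntegerParameterRange API, range values are arbitrary examples.
hpo_config = {
    "ParameterRanges": {
        "IntegerParameterRanges": [
            {
                "Name": "context_length",
                "MinValue": 32,
                "MaxValue": 128,
                "ScalingType": "Auto",  # how HPO scales the search range
            }
        ]
    }
}
```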

## Tips and Best Practices
<a name="aws-forecast-algo-cnnqr-tips"></a>

 **Avoid large values for ForecastHorizon** - Using values over 100 for the `ForecastHorizon` will increase training time and can reduce model accuracy. If you want to forecast further into the future, consider aggregating to a higher frequency. For example, use `5min` instead of `1min`. 

 **CNNs allow for a higher context length** - With CNN-QR, you can set the `context_length` slightly higher than that for DeepAR+, as CNNs are generally more efficient than RNNs. 

 **Feature engineering of related data** - Experiment with different combinations of related time series and item metadata when training your model, and assess whether the additional information improves accuracy. Different combinations and transformations of related time series and item metadata will deliver different results.

 **CNN-QR does not forecast at the mean quantile** – When you set `ForecastTypes` to `mean` with the [CreateForecast](https://docs.aws.amazon.com/forecast/latest/dg/API_CreateForecast.html) API, forecasts are generated at the median quantile (`0.5` or `P50`) instead. 

# DeepAR+ Algorithm
<a name="aws-forecast-recipe-deeparplus"></a>

Amazon Forecast DeepAR+ is a supervised learning algorithm for forecasting scalar (one-dimensional) time series using recurrent neural networks (RNNs). Classical forecasting methods, such as autoregressive integrated moving average (ARIMA) or exponential smoothing (ETS), fit a single model to each individual time series, and then use that model to extrapolate the time series into the future. In many applications, however, you have many similar time series across a set of cross-sectional units, such as demand for different products, server loads, and requests for webpages. In this case, it can be beneficial to train a single model jointly over all of the time series. DeepAR+ takes this approach. When your dataset contains hundreds of feature time series, the DeepAR+ algorithm outperforms the standard ARIMA and ETS methods. You can also use the trained model for generating forecasts for new time series that are similar to the ones it has been trained on.

**Python notebooks**  
For a step-by-step guide on using the DeepAR+ algorithm, see [Getting Started with DeepAR+](https://github.com/aws-samples/amazon-forecast-samples/blob/master/notebooks/advanced/Getting_started_with_DeepAR%2B/Getting_started_with_DeepAR%2B.ipynb).

**Topics**
+ [How DeepAR+ Works](#aws-forecast-recipe-deeparplus-how-it-works)
+ [DeepAR+ Hyperparameters](#aws-forecast-recipe-deeparplus-hyperparameters)
+ [Tune DeepAR+ Models](#aws-forecast-recipe-deeparplus-tune-model)

## How DeepAR+ Works
<a name="aws-forecast-recipe-deeparplus-how-it-works"></a>

During training, DeepAR+ uses a training dataset and an optional testing dataset. It uses the testing dataset to evaluate the trained model. In general, the training and testing datasets don't have to contain the same set of time series. You can use a model trained on a given training set to generate forecasts for the future of the time series in the training set, and for other time series. Both the training and the testing datasets consist of (preferably more than one) target time series. Optionally, they can be associated with a vector of feature time series and a vector of categorical features (for details, see [DeepAR Input/Output Interface](https://docs.aws.amazon.com/sagemaker/latest/dg/deepar.html#deepar-inputoutput) in the *SageMaker AI Developer Guide*). The following example shows how this works for an element of a training dataset indexed by `i`. The training dataset consists of a target time series, `zi,t`, and two associated feature time series, `xi,1,t` and `xi,2,t`.

![\[\]](http://docs.aws.amazon.com/forecast/latest/dg/images/forecast-recipe-deeparplus-ts-full-159.base.png)


The target time series might contain missing values (denoted in the graphs by breaks in the time series). DeepAR+ supports only feature time series that are known in the future. This allows you to run counterfactual "what-if" scenarios. For example, "What happens if I change the price of a product in some way?" 

Each target time series can also be associated with a number of categorical features. You can use these to encode that a time series belongs to certain groupings. Using categorical features allows the model to learn typical behavior for those groupings, which can increase accuracy. A model implements this by learning an embedding vector for each group that captures the common properties of all time series in the group. 

To facilitate learning time-dependent patterns, such as spikes during weekends, DeepAR+ automatically creates feature time series based on time-series granularity. For example, DeepAR+ creates two feature time series (day of the month and day of the year) at a weekly time-series frequency. It uses these derived feature time series along with the custom feature time series that you provide during training and inference. The following example shows two derived time-series features: `ui,1,t` represents the hour of the day, and `ui,2,t` the day of the week. 

![\[\]](http://docs.aws.amazon.com/forecast/latest/dg/images/forecast-recipe-deeparplus-ts-full-159.derived.png)


DeepAR+ automatically includes these feature time series based on the data frequency and the size of training data. The following table lists the features that can be derived for each supported basic time frequency. 



| Frequency of the Time Series | Derived Features | 
| --- | --- | 
| Minute | minute-of-hour, hour-of-day, day-of-week, day-of-month, day-of-year | 
| Hour | hour-of-day, day-of-week, day-of-month, day-of-year | 
| Day | day-of-week, day-of-month, day-of-year | 
| Week | week-of-month, week-of-year | 
| Month | month-of-year | 

A DeepAR+ model is trained by randomly sampling several training examples from each of the time series in the training dataset. Each training example consists of a pair of adjacent context and prediction windows with fixed predefined lengths. The `context_length` hyperparameter controls how far in the past the network can see, and the `ForecastHorizon` parameter controls how far in the future predictions can be made. During training, Amazon Forecast ignores elements in the training dataset with time series shorter than the specified prediction length. The following example shows five samples, with a context length (highlighted in green) of 12 hours and a prediction length (highlighted in blue) of 6 hours, drawn from element `i`. For the sake of brevity, we've excluded the feature time series `xi,1,t` and `ui,2,t`.

![\[\]](http://docs.aws.amazon.com/forecast/latest/dg/images/forecast-recipe-deeparplus-ts-full-159.sampled.png)
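The sampling described above can be sketched as follows (illustrative only; Forecast's actual sampler is not public). The example draws five window pairs with a 12-hour context and a 6-hour prediction, matching the figure:

```python
import random

def sample_training_windows(series, context_length, prediction_length, n):
    """Randomly draw n (context, prediction) window pairs from one series;
    a series too short to fit one full window pair yields no samples."""
    max_start = len(series) - (context_length + prediction_length)
    if max_start < 0:
        return []
    samples = []
    for _ in range(n):
        s = random.randint(0, max_start)
        context = series[s : s + context_length]
        target = series[s + context_length : s + context_length + prediction_length]
        samples.append((context, target))
    return samples

hourly = list(range(48))  # two days of hourly observations
pairs = sample_training_windows(hourly, context_length=12, prediction_length=6, n=5)
```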


To capture seasonality patterns, DeepAR+ also automatically feeds lagged (past period) values from the target time series. In our example with samples taken at an hourly frequency, for each time index `t = T`, the model exposes the `zi,t` values which occurred approximately one, two, and three days in the past (highlighted in pink).

![\[\]](http://docs.aws.amazon.com/forecast/latest/dg/images/forecast-recipe-deeparplus-ts-full-159.lags.png)
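At an hourly frequency, lags of roughly one, two, and three days correspond to offsets of 24, 48, and 72 time steps. A sketch of the lag-feature idea:

```python
def lag_features(series, t, lags=(24, 48, 72)):
    """Past target values fed back as inputs at time index t (hourly data:
    24/48/72 steps back is one/two/three days in the past)."""
    return {f"lag_{k}": series[t - k] if t - k >= 0 else None for k in lags}

hourly = list(range(100))
print(lag_features(hourly, t=80))  # {'lag_24': 56, 'lag_48': 32, 'lag_72': 8}
```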


For inference, the trained model takes as input the target time series, which might or might not have been used during training, and forecasts a probability distribution for the next `ForecastHorizon` values. Because DeepAR+ is trained on the entire dataset, the forecast takes into account learned patterns from similar time series.

For information on the mathematics behind DeepAR+, see [DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks](https://arxiv.org/abs/1704.04110) on the Cornell University Library website. 

## DeepAR+ Hyperparameters
<a name="aws-forecast-recipe-deeparplus-hyperparameters"></a>

The following table lists the hyperparameters that you can use in the DeepAR+ algorithm. Parameters in bold participate in hyperparameter optimization (HPO).


| Parameter Name | Description | 
| --- | --- | 
| `context_length` |  The number of time points that the model reads in before making the prediction. The value for this parameter should be about the same as the `ForecastHorizon`. The model also receives lagged inputs from the target, so `context_length` can be much smaller than typical seasonalities. For example, a daily time series can have yearly seasonality. The model automatically includes a lag of one year, so the context length can be shorter than a year. The lag values that the model picks depend on the frequency of the time series. For example, lag values for daily frequency are: previous week, 2 weeks, 3 weeks, 4 weeks, and year. [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-recipe-deeparplus.html)  | 
| `epochs` |  The maximum number of passes to go over the training data. The optimal value depends on your data size and learning rate. Smaller datasets and lower learning rates both require more epochs to achieve good results. [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-recipe-deeparplus.html)  | 
| `learning_rate` |  The learning rate used in training. [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-recipe-deeparplus.html)  | 
| `learning_rate_decay` |  The rate at which the learning rate decreases. The learning rate is reduced at most `max_learning_rate_decays` times, after which training stops. This parameter is used only if `max_learning_rate_decays` is greater than 0. [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-recipe-deeparplus.html)  | 
| `likelihood` |  The model generates a probabilistic forecast, and can provide quantiles of the distribution and return samples. Depending on your data, choose an appropriate likelihood (noise model) for uncertainty estimates. [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-recipe-deeparplus.html)  | 
| `max_learning_rate_decays` |  The maximum number of learning rate reductions that can occur. [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-recipe-deeparplus.html) | 
| `num_averaged_models` |  In DeepAR+, a training trajectory can encounter multiple models. Each model might have different forecasting strengths and weaknesses. DeepAR+ can average the model behaviors to take advantage of the strengths of all models. [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-recipe-deeparplus.html)  | 
| `num_cells` |  The number of cells to use in each hidden layer of the RNN. [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-recipe-deeparplus.html)  | 
| `num_layers` |  The number of hidden layers in the RNN. [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-recipe-deeparplus.html)  | 

## Tune DeepAR+ Models
<a name="aws-forecast-recipe-deeparplus-tune-model"></a>

To tune Amazon Forecast DeepAR+ models, follow these recommendations for optimizing the training process and hardware configuration. 

### Best Practices for Process Optimization
<a name="aws-forecast-recipe-deeparplus-best-practices"></a>

 To achieve the best results, follow these recommendations: 
+ Except when splitting the training and testing datasets, always provide entire time series for training and testing, and when calling the model for inference. Regardless of how you set `context_length`, don't divide the time series or provide only a part of it. The model will use data points further back than `context_length` for the lagged values feature.
+ For model tuning, you can split the dataset into training and testing datasets. In a typical evaluation scenario, you should test the model on the same time series used in training, but on the future `ForecastHorizon` time points immediately after the last time point visible during training. To create training and testing datasets that satisfy these criteria, use the entire dataset (all of the time series) as a testing dataset and remove the last `ForecastHorizon` points from each time series for training. This way, during training, the model doesn't see the target values for time points on which it is evaluated during testing. In the test phase, the last `ForecastHorizon` points of each time series in the testing dataset are withheld and a prediction is generated. The forecast is then compared with the actual values for the last `ForecastHorizon` points. You can create more complex evaluations by repeating time series multiple times in the testing dataset, but cutting them off at different end points. This produces accuracy metrics that are averaged over multiple forecasts from different time points.
+ Avoid using very large values (> 400) for the `ForecastHorizon` because this slows down the model and makes it less accurate. If you want to forecast further into the future, consider aggregating to a higher frequency. For example, use `5min` instead of `1min`.
+ Because of lags, the model can look further back than `context_length`. Therefore, you don't have to set this parameter to a large value. A good starting point for this parameter is the same value as the `ForecastHorizon`.
+ Train DeepAR+ models with as many time series as are available. Although a DeepAR+ model trained on a single time series might already work well, standard forecasting methods such as ARIMA or ETS might be more accurate and are more tailored to this use case. DeepAR+ starts to outperform the standard methods when your dataset contains hundreds of feature time series. Currently, DeepAR+ requires that the total number of observations available, across all training time series, is at least 300.
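
The training/testing split described above can be sketched as follows (a minimal illustration in which each time series is a plain Python list; the function name and example values are hypothetical):

```python
# Sketch of the evaluation split: the testing dataset keeps every time
# series whole, while the training dataset drops the last
# forecast_horizon points from each series, so the model never sees the
# values it is evaluated on.
def make_train_test(series_list, forecast_horizon):
    test = [list(s) for s in series_list]                    # full series
    train = [list(s[:-forecast_horizon]) for s in series_list]  # withhold the tail
    return train, test

series = [[10, 12, 13, 15, 14, 16, 18, 17]]
train, test = make_train_test(series, forecast_horizon=3)
print(train[0])  # → [10, 12, 13, 15, 14]
print(test[0])   # → [10, 12, 13, 15, 14, 16, 18, 17]
```

During evaluation, the forecast for the withheld tail of each test series is compared against the actual last `ForecastHorizon` values.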

# Exponential Smoothing (ETS) Algorithm
<a name="aws-forecast-recipe-ets"></a>

Exponential Smoothing [(ETS)](https://en.wikipedia.org/wiki/Exponential_smoothing) is a commonly used local statistical algorithm for time-series forecasting. The Amazon Forecast ETS algorithm calls the [ets function](https://cran.r-project.org/web/packages/forecast/forecast.pdf#Rfn.ets.1) in the `Package 'forecast'` of the Comprehensive R Archive Network (CRAN).

## How ETS Works
<a name="aws-forecast-recipe-ets-how-it-works"></a>

The ETS algorithm is especially useful for datasets with seasonality and other prior assumptions about the data. ETS computes a weighted average over all observations in the input time series dataset as its prediction. The weights are exponentially decreasing over time, rather than the constant weights in simple moving average methods. The weights are dependent on a constant parameter, which is known as the smoothing parameter.
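
The exponentially decaying weights can be illustrated with simple exponential smoothing, the most basic ETS form (a sketch of the idea, not the CRAN `ets` implementation; `alpha` is the smoothing parameter):

```python
def simple_exponential_smoothing(observations, alpha):
    """One-step-ahead forecast: a weighted average in which the weight
    on an observation k steps in the past is proportional to
    alpha * (1 - alpha)**k, so older observations count less."""
    level = observations[0]  # initialize with the first observation
    for y in observations[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

history = [20.0, 22.0, 21.0, 25.0, 24.0]
print(simple_exponential_smoothing(history, alpha=0.5))  # → 23.5
```

With `alpha` near 1 the forecast tracks the most recent observation closely; with `alpha` near 0 it behaves more like a long-run average.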

## ETS Hyperparameters and Tuning
<a name="aws-forecast-recipe-ets-hyperparamters"></a>

For information about ETS hyperparameters and tuning, see the `ets` function documentation in the [Package 'forecast'](https://cran.r-project.org/web/packages/forecast/forecast.pdf) of [CRAN](https://cran.r-project.org).

Amazon Forecast converts the `DataFrequency` parameter specified in the [CreateDataset](API_CreateDataset.md) operation to the `frequency` parameter of the R [ts](https://www.rdocumentation.org/packages/stats/versions/3.6.1/topics/ts) function using the following table:


| DataFrequency (string) | R ts frequency (integer) | 
| --- | --- | 
| Y | 1 | 
| M | 12 | 
| W | 52 | 
| D | 7 | 
| H | 24 | 
| 30min | 2 | 
| 15min | 4 | 
| 10min | 6 | 
| 5min | 12 | 
| 1min | 60 | 

Supported data frequencies that aren't in the table default to a `ts` frequency of 1.
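
The conversion in the table amounts to a lookup with a default of 1 (a sketch for reference; the actual conversion happens inside the service):

```python
# DataFrequency (CreateDataset) -> R ts() frequency, per the table above.
# Supported frequencies not listed in the table default to 1.
TS_FREQUENCY = {
    "Y": 1, "M": 12, "W": 52, "D": 7, "H": 24,
    "30min": 2, "15min": 4, "10min": 6, "5min": 12, "1min": 60,
}

def ts_frequency(data_frequency):
    return TS_FREQUENCY.get(data_frequency, 1)

print(ts_frequency("D"))     # → 7
print(ts_frequency("3min"))  # not in the table → 1
```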

# Non-Parametric Time Series (NPTS) Algorithm
<a name="aws-forecast-recipe-npts"></a>

The Amazon Forecast Non-Parametric Time Series (NPTS) algorithm is a scalable, probabilistic baseline forecaster. It predicts the future value distribution of a given time series by sampling from past observations. The predictions are bounded by the observed values. NPTS is especially useful when the time series is intermittent (or sparse, containing many 0s) and bursty, for example, when forecasting demand for individual items where the time series has many low counts. Amazon Forecast provides variants of NPTS that differ in which of the past observations are sampled and how they are sampled. To use an NPTS variant, you choose a hyperparameter setting.

## How NPTS Works
<a name="aws-forecast-recipe-npts-how-it-works"></a>

Similar to classical forecasting methods, such as exponential smoothing (ETS) and autoregressive integrated moving average (ARIMA), NPTS generates predictions for each time series individually. The time series in the dataset can have different lengths. The time points where the observations are available are called the training range and the time points where the prediction is desired are called the prediction range.

Amazon Forecast NPTS forecasters have the following variants: NPTS, seasonal NPTS, climatological forecaster, and seasonal climatological forecaster.

**Topics**
+ [NPTS](#aws-forecast-recipe-npts-variants-npts)
+ [Seasonal NPTS](#aws-forecast-recipe-npts-variants-seasonal)
+ [Climatological Forecaster](#aws-forecast-recipe-npts-variants-climatological)
+ [Seasonal Climatological Forecaster](#aws-forecast-recipe-npts-variants-seasonal-climatological)
+ [Seasonal Features](#aws-forecast-recipe-npts-seasonal-features)
+ [Best Practices](#aws-forecast-recipe-npts-recommended-practices)

### NPTS
<a name="aws-forecast-recipe-npts-variants-npts"></a>

In this variant, predictions are generated by sampling from all observations in the training range of the time series. However, instead of uniformly sampling from all of the observations, this variant assigns weight to each of the past observations according to how far it is from the current time step where the prediction is needed. In particular, it uses weights that decay exponentially according to the distance of the past observations. In this way, the observations from the recent past are sampled with much higher probability than the observations from the distant past. This assumes that the near past is more indicative for the future than the distant past. You can control the amount of decay in the weights with the `exp_kernel_weights` hyperparameter.

To use this NPTS variant in Amazon Forecast, set the `use_seasonal_model` hyperparameter to `False` and accept all other default settings.
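
The exponentially decaying sampling weights can be sketched as follows (a toy illustration of the idea behind `exp_kernel_weights`, not the service's implementation; the function name and example series are hypothetical):

```python
import math
import random

def npts_sample(observations, kernel_weight, num_samples=1000, seed=0):
    """Sample past observations with probability decaying exponentially
    in their distance from the forecast time step: the last observation
    has distance 1, the first has distance len(observations)."""
    n = len(observations)
    weights = [math.exp(-kernel_weight * (n - i)) for i in range(n)]
    rng = random.Random(seed)
    return rng.choices(observations, weights=weights, k=num_samples)

history = [0, 0, 5, 0, 7, 0, 0, 9]  # intermittent demand series
samples = npts_sample(history, kernel_weight=0.5)
# Recent observations dominate the sampled predictive distribution;
# a larger kernel_weight concentrates the samples even more recently.
```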

### Seasonal NPTS
<a name="aws-forecast-recipe-npts-variants-seasonal"></a>

The seasonal NPTS variant is similar to NPTS except that instead of sampling from all of the observations, it uses only the observations from the past *seasons*. By default, the season is determined by the granularity of the time series. For example, for an hourly time series, to predict for hour *t*, this variant samples from the observations corresponding to the hour *t* on the previous days. Similar to NPTS, the observation at hour *t* on the previous day is given more weight than the observations at hour *t* on earlier days. For more information about how to determine seasonality based on the granularity of the time series, see [Seasonal Features](#aws-forecast-recipe-npts-seasonal-features).

### Climatological Forecaster
<a name="aws-forecast-recipe-npts-variants-climatological"></a>

The climatological forecaster variant samples all of the past observations with uniform probability. 

To use the climatological forecaster, set the `kernel_type` hyperparameter to `uniform` and the `use_seasonal_model` hyperparameter to `False`. Accept the default settings for all other hyperparameters.

### Seasonal Climatological Forecaster
<a name="aws-forecast-recipe-npts-variants-seasonal-climatological"></a>

Similar to seasonal NPTS, the seasonal climatological forecaster samples the observations from past seasons, but samples them with uniform probability. 

To use the seasonal climatological forecaster, set the `kernel_type` hyperparameter to `uniform`. Accept the default settings for all other hyperparameters.

### Seasonal Features
<a name="aws-forecast-recipe-npts-seasonal-features"></a>

To determine what corresponds to a season for the seasonal NPTS and seasonal climatological forecaster, use the features listed in the following table. The table lists the derived features for the supported basic time frequencies, based on granularity. Amazon Forecast includes these feature time series, so you don't have to provide them.



| Frequency of the Time Series | Feature to Determine Seasonality | 
| --- | --- | 
| Minute | minute-of-hour | 
| Hour | hour-of-day | 
| Day | day-of-week | 
| Week | day-of-month | 
| Month | month-of-year | 
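
The features in the table can be derived directly from each timestamp; a minimal sketch using the standard library (the function name and frequency labels are illustrative, and Forecast computes these for you):

```python
from datetime import datetime

def seasonal_feature(ts, frequency):
    """Derive the seasonality feature from the table above."""
    features = {
        "Minute": ts.minute,     # minute-of-hour
        "Hour": ts.hour,         # hour-of-day
        "Day": ts.weekday(),     # day-of-week (0 = Monday)
        "Week": ts.day,          # day-of-month
        "Month": ts.month,       # month-of-year
    }
    return features[frequency]

t = datetime(2020, 7, 14, 9, 30)
print(seasonal_feature(t, "Hour"))  # → 9
print(seasonal_feature(t, "Day"))   # → 1 (Tuesday)
```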

### Best Practices
<a name="aws-forecast-recipe-npts-recommended-practices"></a>

When using the Amazon Forecast NPTS algorithms, consider the following best practices for preparing the data and achieving optimal results:
+ Because NPTS generates predictions for each time series individually, provide the entire time series when calling the model for prediction. Also, accept the default value of the `context_length` hyperparameter. This causes the algorithm to use the entire time series. 
+  If you change the `context_length` (because the training data is too long), make sure it is large enough and covers multiple past seasons. For example, for a daily time series, this value must be at least 365 days (provided that you have that amount of data). 

## NPTS Hyperparameters
<a name="aws-forecast-recipe-npts-hyperparamters"></a>

The following table lists the hyperparameters that you can use in the NPTS algorithm.


| Parameter Name | Description | 
| --- | --- | 
| context_length | The number of time-points in the past that the model uses for making the prediction. By default, it uses all of the time points in the training range. Typically, the value for this hyperparameter should be large and should cover multiple past seasons. For example, for a daily time series this value must be at least 365 days. [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-recipe-npts.html)  | 
| kernel_type | The kernel to use to define the weights used for sampling past observations. [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-recipe-npts.html)  | 
| exp_kernel_weights |  Valid only when `kernel_type` is `exponential`. The scaling parameter of the kernel. For faster (exponential) decay in the weights given to the observations in the distant past, use a large value. [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-recipe-npts.html)  | 
| use_seasonal_model | Whether to use a seasonal variant. [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-recipe-npts.html)  | 
| use_default_time_features |  Valid only for the *seasonal NPTS* and *seasonal climatological forecaster* variants. Whether to use seasonal features based on the granularity of the time series to determine seasonality. [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/forecast/latest/dg/aws-forecast-recipe-npts.html)  | 

# Prophet Algorithm
<a name="aws-forecast-recipe-prophet"></a>

[Prophet](https://facebook.github.io/prophet/) is a popular local Bayesian structural time series model. The Amazon Forecast Prophet algorithm uses the [Prophet class](https://facebook.github.io/prophet/docs/quick_start.html#python-ap) of the Python implementation of Prophet.

## How Prophet Works
<a name="aws-forecast-recipe-prophet-how-it-works"></a>

Prophet is especially useful for datasets that:
+ Contain an extended time period (months or years) of detailed historical observations (hourly, daily, or weekly)
+ Have multiple strong seasonalities
+ Include previously known important, but irregular, events
+ Have missing data points or large outliers
+ Have non-linear growth trends that are approaching a limit

Prophet is an additive regression model with a piecewise linear or logistic growth curve trend. It includes a yearly seasonal component modeled using Fourier series and a weekly seasonal component modeled using dummy variables.
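
The Fourier-series seasonal component can be sketched as a truncated sum of sine and cosine terms (an illustration of the modeling idea, not Prophet's implementation; the coefficients and number of terms are hypothetical example values):

```python
import math

def fourier_seasonality(day_of_year, coefficients, period=365.25):
    """Yearly seasonal component as a truncated Fourier series:
    s(t) = sum over k of a_k * cos(2*pi*k*t/P) + b_k * sin(2*pi*k*t/P)."""
    total = 0.0
    for k, (a_k, b_k) in enumerate(coefficients, start=1):
        angle = 2 * math.pi * k * day_of_year / period
        total += a_k * math.cos(angle) + b_k * math.sin(angle)
    return total

# Hypothetical fitted coefficients for two Fourier terms.
coeffs = [(1.2, -0.4), (0.3, 0.1)]
print(fourier_seasonality(180, coeffs))
```

More Fourier terms let the seasonal curve fit sharper within-year patterns, at the cost of a higher risk of overfitting.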

For more information, see [Prophet: forecasting at scale](https://research.facebook.com/blog/2017/2/prophet-forecasting-at-scale/).

## Prophet Hyperparameters and Related Time Series
<a name="aws-forecast-recipe-prophet-hyperparamters"></a>

Amazon Forecast uses the default Prophet [hyperparameters](https://facebook.github.io/prophet/docs/quick_start.html#python-ap). Prophet also supports related time series as features, provided to Amazon Forecast in the related time series CSV file.